RFC: Crushinator Probe/Interrogation Implementation Details (branching questions, skipping, validation, etc)

Introduction

The Crushinator project aims to replace PasteScript as a de facto standard for developing boilerplate code generation systems. My interest in starting the project came from use of and work on ZopeSkel/templer.core. There are several issues with the way that PasteScript solicits values from the user that are restrictive to the developer and often frustrating for the user.

If you’re not familiar with how PasteScript works, you typically invoke it via the paster create sub-command. You are then prompted for a series of values that correspond to template variables. It works in a linear fashion. The values may be manipulated beforehand, but are essentially passed verbatim to the underlying template.

For more background, see the design document.

Terminology Clarification

‘user’ vs ‘developer’

When I talk about users here, I mean someone who invokes the paster (or zopeskel/templer) command to run a code generation ‘template’. When I talk about developers, I mean the folks who write the ‘template’ code.

‘template’ vs Template vs template

There’s a confusing bit of nomenclature in PasteScript regarding the word ‘template’. Typically when people talk about templates, they mean a predefined structure; most people think of Word templates or cut pieces of plastic or paper used to guide a pen or saw. In both cases, there’s an implication of structure and repeatability.

When programmers talk about templates, they typically mean text files with special syntax; these text files provide structure and repeatability as in the common sense, but templates for programmers are typically filled with place holders, and often simplified logic. Examples are vast, but include Cheetah, Jinja2, Smarty, Mason, etc.

In PasteScript, Template is a base class that handles the interaction between the user and (typically) a template (in the programmer sense). It also does some things with treating code layouts like templates, doing additive injections of code, etc.

Separating these concerns and quelling this confusion is a core goal of the Crushinator project. That said, the only real frame of reference anyone reading this will likely have is PasteScript, so when I use the term ‘template’, I’ll put it in quotes, and I’ll be referring to templates in the PasteScript sense. If I don’t quote it, assuming it wasn’t just a typo, I mean it in the programmer sense.

Specific Issues This Document Addresses

This section covers the specific problems with PasteScript that are being addressed by the Probe/Interrogation implementation, and the new concepts and features that will also be implemented.

Skipping

We’ve often found it useful to skip some prompts. Let’s say the user enters a response to a prompt, and that response implies some sane defaults for later prompts. It’s difficult for the developer to handle this situation with the core PasteScript library (ZopeSkel and templer.core have overcome this on their own).

A concrete example: there are about a half-dozen template variables in the python egg-related templates (package, namespace, nested-namespace) that deal with package metadata (license, author, keywords, etc). These values are strictly optional in the sense of building the code skeleton successfully and creating a working package. Aside from the ZopeSkel hacking around the issue, there’s no way to just ask the user “do you want to set metadata now?”, and if they say “no”, skip all of those questions, substituting sane defaults. (See ZopeSkel’s ‘easy mode’ to see this in practice)

This happens because the prompting happens independently of the processing and template transformation. You have to wait until the user is finished answering all of the questions before you can act.
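A minimal sketch of the skipping behaviour described above, assuming prompting and processing are interleaved. Every name here (collect_values, METADATA_DEFAULTS, the prompt wording, the 'UNKNOWN' default) is hypothetical and is not part of PasteScript or Crushinator:

```python
# Hypothetical defaults used when the user declines to enter metadata.
METADATA_DEFAULTS = {"license": "UNKNOWN", "author": "", "keywords": ""}


def collect_values(ask):
    """Gather template values, skipping the metadata prompts on request.

    `ask` is any callable that takes a prompt string and returns the
    user's answer (for example the built-in input()).
    """
    values = {"package": ask("Package name")}
    wants_metadata = ask("Do you want to set metadata now? (yes/no)")
    if wants_metadata.strip().lower() in ("y", "yes"):
        # Ask each metadata question in turn.
        for name in METADATA_DEFAULTS:
            values[name] = ask(name.capitalize())
    else:
        # Skip the whole group and substitute the defaults.
        values.update(METADATA_DEFAULTS)
    return values
```

Because the answer to the gate question is acted on immediately, the remaining metadata prompts never reach the user when they answer "no".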

Validation

In PasteScript, there is no ‘baked-in’ facility for checking a value from the user. If you do detect an error, you have to rerun the prompt loop, or intervene on your own.

This has caused some pain; aside from the obvious (e.g. the developer assumes a raw integer and the user enters ‘one’), often the values entered by the user will correspond directly to a python identifier, or a package name. A good developer will typically massage the value to make it conform, but it doesn’t always happen.

What’s worse, it can confuse the user. Say they entered “My Class”, understanding that the mangling will happen. They expected “MyClass”, but instead got “My_Class” due to an assumption on the developer’s part. I’d rather see an error here, or feedback about what happened (‘“My Class” is not a valid class name, is My_Class ok?’, or ‘You set the content type to “My Class”, so we’re going to call the class My_Class’, etc).

The way I look at it, the code generated by a tool like Crushinator should be ‘sound’ out of the box. The user should be able to understand what happened by looking at it, it shouldn’t do anything ‘magical’ for its own sake, and as much as possible, it should run without error.

Validation and allowing some sort of feedback to the user will alleviate these issues.
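As a rough illustration of validation with feedback, the sketch below checks a class name and tells the caller what it changed instead of mangling silently. The function name, the exception class and the message wording are assumptions made for this example, not Crushinator API:

```python
import keyword
import re


class ValidationError(ValueError):
    """Raised when a value cannot be turned into something usable."""


def validate_class_name(raw):
    """Return (value, feedback); feedback is None when nothing changed."""
    candidate = raw.strip()
    if candidate.isidentifier() and not keyword.iskeyword(candidate):
        return candidate, None
    # Replace runs of non-word characters with underscores.
    suggestion = re.sub(r"\W+", "_", candidate).strip("_")
    if (not suggestion.isidentifier()) or keyword.iskeyword(suggestion):
        raise ValidationError("%r is not a valid class name" % raw)
    feedback = "%r is not a valid class name; using %r instead" % (raw, suggestion)
    return suggestion, feedback
```

A user interface could show the feedback string as-is, or turn it into a confirmation question before committing the value.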

Branching

Building on the ‘skipping’ concept, this involves presenting the user with a whole new set of prompts given their specific response to a certain prompt. As far as I know, there is no current implementation of this concept in any PasteScript-based boilerplate generators.

An example: The user is building a new python package. It asks them, “would you like to set up Sphinx documentation?”. If the user says “yes”, they are asked specific questions that are necessary for Sphinx to function. If they say “no”, they go on without it. A similar approach could be used for adding a zc.buildout structure.

Grouping

There’s utility in being able to arbitrarily group prompts, primarily in ‘alternative’ front ends, such as the web or desktop. This could be implemented as a tabbed interface, or a ‘wizard’, or just allow the developer to provide more guidance to the user by giving a description and/or help-related text that covers a subset of a larger suite of prompts.

To my knowledge, this is not implemented in PasteScript.

More information

In PasteScript, developers are limited to a single, simple string that gets passed to the user when solicited for a value. Often developers want to provide a more detailed explanation for what the value means, what its implications are, etc.

Crushinator will provide several mechanisms for the developer to communicate with the user (but most of these will be optional).

Markup in Prompts

Currently there’s no way to effectively provide emphasis, urls, or otherwise enrich the prompts as they’re given to the user. This limits both the amount of help the developer can give the user, and makes more robust user interfaces (web, desktop) more difficult to implement.

There will be a single markup standard used throughout Crushinator. This is independent from any template processors that the developer uses.

Markup would apply to any text that is presented to the user.

Internationalization

i18n is not currently a common component of PasteScript (I don’t think it’s impossible to implement it with gettext, but AFAIK nobody’s doing it). As PasteScript-based generators have often become the go-to ‘quick-start’ for new users of a framework, the lack of this feature is a hindrance to acceptance.

It will be optional but recommended (using gettext makes it pretty easy to open the door to other folks translating it for you). However it is implemented, it will need to be able to handle the markup mentioned above. All text provided to the user will be expected to be passed through that translation mechanism.

User Feedback

When a value fails validation, it’s useful to tell the user why. Crushinator will expand on this concept by also allowing feedback to the user when validation passes.

An example: The user is building a python egg, and they opt to not fill out the metadata. The developer can point the user to the generated setup.py file, and a link to the setuptools page explaining what the values mean, so they’ll have a pointer to what they’re supposed to do.

Implementation

Terms

In the design document, I’ve settled on a separation of many of the concerns and picked nomenclature to identify the parts. In this document, I’m focusing on the Probe, Interrogation, and (to a lesser extent) the Runner classes. There will be mention of the Collector and UserInterface classes, and fleeting mention of Skeleton and Injector, but the specifics of UserInterface, Skeleton and Injector are still in flux. Although Collector has a base implementation, it’s only utilized by the UserInterface, so there’s not much need for detail here.

A basic overview of the terms/base classes:

Probe: A single piece of information required to complete the code generation task.
Interrogation: A collection of Probes.
UserInterface: Literally, a user interface. Provides user interaction.
Runner: A collection of Interrogations and Skeletons. Acts as intermediary; provides Probes to the User Interface, and template values to the Skeletons and Injectors.
Collector: Seeker of default values. Current base implementation pulls from a handful of setup files, and the command line.

All of these classes reside in crushinator.framework. As typical implementations are built, they will be part of the crushinator.toolkit package.

Basic Communications

Crushinator itself provides a setuptools entry-point facility to register (plug-in) User Interfaces, Collectors, and Runners. The User Interface provides a plug-in point for Runners. In this way the Runner expresses compatibility with a given set of User Interfaces. The User Interface will also register any setuptools ‘console scripts’ that the user will use to invoke it.
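For readers unfamiliar with setuptools entry points, here is a sketch of what such a registration could look like in a package's setup.py. Only the console_scripts group is a real setuptools convention; the other group name, the command name and all module paths are made up for illustration:

```python
# Fragment of a hypothetical setup.py. In a real file this dictionary
# would be passed as setup(..., entry_points=ENTRY_POINTS).
ENTRY_POINTS = {
    # Real setuptools group: installs the command the user runs.
    "console_scripts": [
        "crush = my.package.cli:main",  # hypothetical command and module
    ],
    # Hypothetical group published by a User Interface; a Runner lists
    # itself here to express compatibility with that User Interface.
    "crushinator.ui.console.runners": [
        "my_runner = my.package:MyRunnerClass",
    ],
}
```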

Help Documentation

A user will invoke the User Interface via the console script. The User Interface must provide a help() method (and corresponding UI element or command line switch), which is expected to return a list of all Runners that have expressed compatibility with that User Interface. The help() method will also take a Runner class or string containing a dotted-package notation (e.g. ‘my.package.MyRunnerClass’).

The Runner class must also provide a help() method. When a Runner specifier is passed to the User Interface’s help() method, the User Interface looks up the Runner and invokes the Runner’s help() method. The Runner’s help() method also takes a specifier. In this case the value can be: a Probe class/package-notation, an Interrogation class/package-notation, a two-tuple (of Interrogation class/package-notation, Probe name), or just a Probe name.
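One way the "class or dotted-package notation" look-up could work is sketched below. The resolve() helper is hypothetical, and it only handles top-level names in a module:

```python
import importlib


def resolve(specifier):
    """Return the object named by a dotted path, or the object itself.

    Accepts either an already-imported class or a string such as
    'my.package.MyRunnerClass'.
    """
    if not isinstance(specifier, str):
        return specifier
    module_name, _, attribute = specifier.rpartition(".")
    if not module_name:
        raise ValueError("%r is not in dotted-package notation" % specifier)
    module = importlib.import_module(module_name)
    return getattr(module, attribute)
```

A help() implementation could call a helper like this first, so that the rest of its logic only ever deals with classes.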

Probes and Interrogations will also (optionally) have a help() method, returning general text about what they do. This might not typically differ from the description property also expected, but is provided in the event that more context can be given outside of a short clarification as to what a Probe/Interrogation’s purpose or expected value is.

In all cases, as mentioned earlier, the text may contain lightweight markup.

The goal here is to encourage developers to include useful help at each level of the system, thus providing a better end-user experience. The user should be able to pull up help on a specific Runner, or a specific Probe and/or Interrogation with relative ease, and the User Interface should be able to provide such help text in an intuitive way.

User Interaction

The User Interface will provide some mechanism for a user to select a Runner they wish to work with. Default values will be pulled via invoking one or more appropriate Collectors. Collectors return a dictionary, where the keys are Runner class specifications (dotted package notation), and the values are dictionaries of individual Probe names and their default values. The defaults for a specific Runner are passed to the Runner’s constructor.
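The nested dictionary that Collectors return might look like the following. The merge_defaults() helper, the Runner path and the rule that later Collectors override earlier ones are assumptions for this sketch; the text above does not specify how results from several Collectors are combined:

```python
def merge_defaults(*collected):
    """Merge Collector results; later results win for each Probe name."""
    merged = {}
    for result in collected:
        for runner_spec, probe_defaults in result.items():
            merged.setdefault(runner_spec, {}).update(probe_defaults)
    return merged


# Keys are Runner specifications, values map Probe names to defaults.
from_setup_file = {
    "my.package.MyRunnerClass": {"author": "Jane Doe", "license": "MIT"},
}
from_command_line = {
    "my.package.MyRunnerClass": {"license": "BSD"},
}

defaults = merge_defaults(from_setup_file, from_command_line)
# The slice for one Runner is what would be handed to its constructor.
runner_defaults = defaults.get("my.package.MyRunnerClass", {})
```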

The Runner will internally maintain a collection of Interrogations, which, in turn, maintain a collection of Probes. The User Interface invokes the Runner’s next() and set() methods as it interacts with the user.

The next() method returns a Probe object. The set() method takes a value and a Probe specifier. This ‘commits’ the value back to the Runner. The set() method returns any success output from the Probe, or raises an exception upon error. This gives us a few useful API interactions:

  • Branching becomes possible. The Runner consults the Probe’s next() method to retrieve the next Probe (or Interrogation) that should be presented to the user. What the Probe returns is dependent on its current value.
  • Validation happens along the way. Each Probe implements a validate() method, which is invoked by the Runner’s set() method. The validate() method can also be used on its own to check a value before saving it.
  • The Interrogation also implements a validate() method, allowing for invariant checks and other sorts of group validation.
  • The User Interface can build dynamic interfaces by pre-polling the Probes without calling set(). This is necessary when building UIs for the web or desktop.
  • The user can get feedback about successes as well as errors.
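The next()/set() interaction above can be sketched roughly as follows. This is an illustration under several assumptions rather than the framework's implementation: set() receives the Probe object itself instead of a specifier, branching is driven by a hypothetical follow_ups mapping, and ProbeError is an invented exception name:

```python
class ProbeError(Exception):
    """Raised when a value fails validation."""


class Probe:
    def __init__(self, name, label, follow_ups=None):
        self.name = name
        self.label = label
        self.value = None
        # Maps a committed value to the Probes it unlocks (branching).
        self.follow_ups = follow_ups or {}

    def validate(self, value):
        """Return success text, or raise ProbeError."""
        if value == "":
            raise ProbeError("%s: a value is required" % self.label)
        return "%s set to %r" % (self.label, value)

    def next(self):
        """Return the Probes that should follow, given the current value."""
        return list(self.follow_ups.get(self.value, []))


class Runner:
    def __init__(self, probes):
        self._pending = list(probes)
        self.values = {}

    def next(self):
        """Return the next Probe to present, or None when finished."""
        return self._pending.pop(0) if self._pending else None

    def set(self, value, probe):
        """Validate and commit a value; return the Probe's success text."""
        message = probe.validate(value)  # raises ProbeError on bad input
        probe.value = value
        self.values[probe.name] = value
        # Queue any follow-up Probes ahead of the remaining ones.
        self._pending[0:0] = probe.next()
        return message


sphinx = Probe(
    "use_sphinx",
    "Set up Sphinx documentation?",
    follow_ups={"yes": [Probe("docs_dir", "Documentation directory")]},
)
runner = Runner([Probe("package", "Package name"), sphinx])

# Canned answers stand in for a real User Interface.
answers = {"package": "my.pkg", "use_sphinx": "yes", "docs_dir": "docs"}
probe = runner.next()
while probe is not None:
    print(runner.set(answers[probe.name], probe))
    probe = runner.next()
```

If set() raises, the User Interface still holds the Probe and can ask again, and the docs_dir prompt only appears when the Sphinx question is answered "yes".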

Another feature of this communication is a high level of autonomy. A Runner must provide a list of Probes to a User Interface outside of the next()/set() interaction, as well as a separate list of Interrogations. Each Probe or Interrogation can be invoked, validated, and then set() independently. This provides the sort of flexibility necessary for asynchronous UIs. Those lists will react to existing values just like the next() method does (the API here is still a bit cloudy; they may actually take values as a method argument). In this way the User Interface can present the user with a ‘wizard’ view or tabbed interface.

After all of the Probe values are collected from the user and set back on the Runner, the User Interface will then call the Runner’s invoke() method. This is where the actual code generation will happen. If any Skeleton or Injector fails, invoke() will raise an exception containing the error text from each failure. If it runs successfully, the return value will be a success string, much like the validate() method of a Probe.
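Since Skeleton and Injector are still in flux, the following is only a guess at the shape of invoke(): plain callables stand in for Skeletons and Injectors, and GenerationError is an invented exception name. It shows one way to report every failure rather than stopping at the first:

```python
class GenerationError(Exception):
    """Carries the error text from every failed generation step."""

    def __init__(self, errors):
        super().__init__("; ".join(errors))
        self.errors = errors


def invoke(steps, values):
    """Run each step with the collected values; return success text."""
    errors = []
    for step in steps:
        try:
            step(values)
        except Exception as exc:
            name = getattr(step, "__name__", repr(step))
            errors.append("%s: %s" % (name, exc))
    if errors:
        raise GenerationError(errors)
    return "code generated successfully"
```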

The exact details here may change quite a bit over time as the Skeleton and Injector classes are fleshed out. There’s a good chance that the Runner will be passed a list of files that were changed, or other useful info beyond “code generated successfully”. However, the intent is the same: let the user know what’s going on.

More Details

I’m planning to use reStructuredText to handle markup in labels, help text, and other text passed to the user. It will probably happen automatically, but I may need to provide a way to prevent processing if it gets in the way.

Gettext will be used to set up translations of all core implementations. It may be rolled into the reStructuredText processor if that ends up being automatic. The idea is to make help text and translations a seamless part of the development process.

Interrogations and Runners will implement some sort of dictionary-like API to retrieve specific Probes. I see the use case here in a fancy web UI that validates via AJAX as the user enters values.
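The dictionary-like API could be as small as the sketch below, which builds a read-only mapping on collections.abc.Mapping. The Probe stand-in and the choice to key on the Probe name are assumptions for illustration:

```python
from collections import namedtuple
from collections.abc import Mapping

# Minimal stand-in for a real Probe class.
Probe = namedtuple("Probe", ["name", "label"])


class Interrogation(Mapping):
    """A read-only, dictionary-like collection of Probes keyed by name."""

    def __init__(self, probes):
        self._probes = {probe.name: probe for probe in probes}

    def __getitem__(self, name):
        return self._probes[name]

    def __iter__(self):
        return iter(self._probes)

    def __len__(self):
        return len(self._probes)


metadata = Interrogation([Probe("author", "Author"), Probe("license", "License")])
author_probe = metadata["author"]  # what an AJAX handler would look up
```

Subclassing Mapping supplies get(), keys(), items() and the in operator for free, which is about all a web front end validating one field at a time would need.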

As I mentioned earlier, heavy use of setuptools entry points will be employed to register User Interfaces as console apps, and to provide a way for Runners to integrate with specific UIs. That said, there will be a set of stock implementations for user interfaces that should be compatible with any Runner.

Additional Products

In the crushinator.toolkit package, the aim is to provide useful implementations and helper libraries to make building code generators as easy as possible. To facilitate the implementations outlined above, a few specific utilities start to emerge:

  • Factory methods for Probes and Interrogations. It should be easy to make a Probe or Interrogation. The factory pattern makes sense here.
    NOTE: This might actually be best suited for crushinator.framework
    Some convenience features would include:
    • passing a callable to dynamically set the validate() method
    • generating a template variable name (‘short name’) for the probe automatically based on the label
  • Implementation of validating Probes for many useful data types:
    • Integer
    • Float
    • E-mail address
    • Python identifier
    • URI
  • User Interface registration helper for Runners. There should be automatic registration for all of the stock UI implementations in crushinator.toolkit.
  • Gettext/reStructuredText parsing for labels, help, etc. Preferably one function that handles both tasks at once. This may be rolled into the base classes and become transparent to the developer.
  • Logging – I’ve already begun to implement logging in both the crushinator.framework and crushinator.toolkit packages. I want to make that as streamlined as possible for the developer. The idea is to swallow as many exceptions as possible, returning ‘friendly’ text to the user, but always log that something happened with as much detail as possible. I’m impressed with the way Python handles logging, and it’s ideal for systems like this.
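To make the factory idea concrete, here is a hedged sketch covering two of the convenience features listed above: passing a callable as the validator, and deriving the short name from the label. The names make_probe, short_name and integer are placeholders, not the toolkit's API:

```python
import re


class Probe:
    def __init__(self, name, label, validator):
        self.name = name
        self.label = label
        self._validator = validator

    def validate(self, value):
        """Delegate to the callable supplied at construction time."""
        return self._validator(value)


def short_name(label):
    """Derive a template variable name from a human-readable label."""
    name = re.sub(r"[^0-9a-z]+", "_", label.lower()).strip("_")
    if not name:
        raise ValueError("cannot derive a short name from %r" % label)
    return name


def integer(value):
    """Validator for integer Probes; raises ValueError on input like 'one'."""
    return int(str(value).strip())


def make_probe(label, validator=None, name=None):
    """Build a Probe, filling in the short name and a pass-through validator."""
    return Probe(name or short_name(label), label, validator or (lambda value: value))


email = make_probe("Author e-mail")
port = make_probe("Port number", validator=integer)
```

Here email.name comes out as author_e_mail, and port.validate() returns an int or raises ValueError, which a Runner could translate into friendly text for the user.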

Wrap Up

I’m preparing to implement things as I’ve outlined in this document. I’m interested in hearing what others think about this plan; is there anything I’ve omitted? Are there any assumptions I’m making that aren’t accurate? Have I overlooked any potential problems with user interface implementations you’re interested in seeing developed?

Comments, corrections, discussion, questions… really any feedback would be greatly appreciated.

This entry was posted in crushinator, python, rfc.

2 Responses to RFC: Crushinator Probe/Interrogation Implementation Details (branching questions, skipping, validation, etc)

  1. Would it be helpful to say “I don’t get it”? :-) That’s all I can muster at the moment. But I see you have put some time/effort into this so I’d hate for you to get zero feedback.

  2. I’m not sure what’s not to get. It’s pretty clearly stated:

    “The Crushinator project aims to replace PasteScript as a de facto standard for developing boilerplate code generation systems. My interest in starting the project came from use of and work on ZopeSkel/templer.core. There are several issues with the way that PasteScript solicits values from the user that are restrictive to the developer and often frustrating for the user.”

    Maybe more pointed RFC feedback might help. ZopeSkel’s dependency on PasteScript has taken ZopeSkel to the end of the road in what can be done to respond to user concerns about ZopeSkel. Crushinator is the way out and it’s got an extremely well thought out design. Josh is looking for comment/interest on that design.
