A Guide to Undo (Part 1 of 3)

Andre Milota

Dialog Engine Architect at Huawei Technologies Social robotics research lab

Published: April 21, 2023

1 Abstract

In this series, I will describe some of my experiences implementing complex UNDO/REDO mechanisms. In this first installment, we will take a detailed look at the semantics of UNDO.

In the second installment, I will cover basic implementation approaches.
Finally, in the third installment, I will discuss advanced techniques, including hybrids and integration with complex layered architectures.

3 Introduction

UNDO is a crucial element of many GUI-based applications as it allows users to make mistakes and experiment. In general, the semantics of the UNDO mechanism should be as consistent as possible across all applications running on a particular operating system. Therefore, there is little room for significant variation. Nevertheless, it can be quite challenging for a designer to provide an intuitive set of UNDO semantics for their particular program.

Apple’s Human Interface Guidelines provide some guidance for developers in that ecosystem, but the guidance is very general. Personally, I look to the most popular applications when trying to assess what users are most likely to expect in domain-specific situations. Among others, I turn to Microsoft Word, Visual Studio, CorelDRAW, and Avid’s Pro Tools for guidance.

4 Series Outline

This series of articles is primarily intended for developers. In this first of three articles, I will briefly review the common semantics used to implement UNDO and discuss some more obscure issues that the developer should consider when planning their implementation. This first article may also be of some use to UI designers, or it may serve as a starting point for a discussion between the UI designer and the developer, forming the basis of a checklist of what needs to be defined. In the second article, I will describe basic implementation methods, and in the third, I will describe more detailed software architecture and integration with complex modular and layered architectures.

5 UNDO vs. Time Travel and Versioning

In its simplest form, UNDO can be thought of as merely time travel. Under this metaphor, all states are stored on a timeline. The user can navigate backward along this timeline to see prior states. Each step takes us back one or more actions (see below). Unlike an append-only log of states, taking a new action after going back starts a new history from that point forward.

However, the user might invoke UNDO too many times and get back to an undesired state. Or they may merely want to observe the prior state. Consequently, many systems also provide a REDO operation that acts as a reversal of the UNDO operation itself.

We can also think of the timeline as a sequence of actions rather than states. As such, there is the possibility of going back to a particular time/state and changing the action sequence without truncating the history at that point. This may be highly desirable. For instance, one might want to alter an object before it was copied multiple times so that all of the copies inherit the change. However, changing an upstream action may invalidate later actions or make their interpretation ambiguous. For instance, if the user goes back before a rotate action and rotates the object a little more, should we interpret the now second rotation action as rotating the object by a particular angle or to a particular angle?

Trying to do this would make the interface much more complicated. While such systems have been explored, the standard interaction that GUI users expect, and can use without a substantial learning curve, simply drops all future actions/states when the user goes back in the UNDO history and makes a change.
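To make the standard model concrete, here is a minimal sketch of a state-based timeline in Python. The class name `LinearHistory` and its methods are my own illustrative choices, and storing whole states is an assumption made for brevity rather than a recommendation. The point to notice is that `do` discards everything ahead of the current position.

```python
class LinearHistory:
    """State-based UNDO/REDO timeline (hypothetical, for illustration only)."""

    def __init__(self, initial_state):
        self._states = [initial_state]  # the timeline of stored states
        self._index = 0                 # position of the current state

    @property
    def current(self):
        return self._states[self._index]

    def do(self, new_state):
        # A new action drops every state that lies "in the future".
        del self._states[self._index + 1:]
        self._states.append(new_state)
        self._index += 1

    def undo(self):
        # Step back one state, stopping at the start of the timeline.
        if self._index > 0:
            self._index -= 1
        return self.current

    def redo(self):
        # Step forward one state, if any undone state remains.
        if self._index < len(self._states) - 1:
            self._index += 1
        return self.current


history = LinearHistory("")
history.do("a")
history.do("ab")
history.undo()    # back to "a"
history.do("ax")  # "ab" is no longer reachable
```

After the last call, REDO has nothing to restore, which is exactly the truncation behavior users have come to expect.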

We can discuss such modifications of a sequence of actions with colleagues, especially if we can draw things and/or mark on computer-generated timelines. A multimodal system that includes natural language dialog, visualizations, and gesture input could easily address a more complex UNDO model. Researchers have also used visualizations, and version control systems use artificial language-based dialog for similar complicated temporal navigation. However, in a system with only two commands, the simple standard UNDO/REDO model may be optimal. Even if it isn’t optimal, this is what your users will be expecting. More to the point, this series is about implementing this particular UNDO/REDO abstraction, rather than exploring other UIs.

6 Depth Limitations

In some systems, only a single UNDO operation may be provided, but in most modern applications the user is given an effectively unbounded stack onto which all operations are pushed. Some applications limit the number of UNDOs that the system will store. After a certain number of actions have been undone, it will not be possible to UNDO any more operations. This is generally done to limit memory use, but it may not be necessary in today's world, even when editing video. However, it may still be desired in some big data processing and preparation applications. In some architectures, the implementation of a limited UNDO feature might actually make the system more complex.
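As a rough sketch of a depth limit, the following uses Python's standard `collections.deque` with `maxlen`, which silently discards the oldest entry once the limit is reached. The class name `BoundedUndoStack` and the string "actions" are placeholders I made up for the example.

```python
from collections import deque


class BoundedUndoStack:
    """UNDO stack that remembers at most max_depth actions (illustrative)."""

    def __init__(self, max_depth):
        # deque(maxlen=n) drops items from the opposite end when full.
        self._actions = deque(maxlen=max_depth)

    def push(self, action):
        self._actions.append(action)

    def pop(self):
        # Returns the most recent action, or None when nothing is left.
        return self._actions.pop() if self._actions else None

    def __len__(self):
        return len(self._actions)


stack = BoundedUndoStack(max_depth=3)
for name in ["a", "b", "c", "d"]:
    stack.push(name)  # "a" falls off the bottom when "d" arrives
```

This works cleanly only when each entry is self-contained. If entries are deltas that depend on older entries, dropping the oldest one may require folding it into a base state, which is one way a limit can add complexity.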

If we only have one level of UNDO, we can map the undo key to alternately UNDO and REDO, and we will not need an additional input path. Essentially, the undo key flips between the current and the prior state. While some editors do this, it is not the most common UI. Your users may not appreciate the surprise.
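A single-level toggle of this kind might look like the sketch below. `ToggleUndo` is a hypothetical name, and the document is assumed to be a plain value that can be swapped wholesale.

```python
class ToggleUndo:
    """Single-level UNDO where one key flips between two states (illustrative)."""

    def __init__(self, state):
        self.current = state
        self._previous = None
        self._has_previous = False

    def do(self, new_state):
        # Remember only the state immediately before this action.
        self._previous, self.current = self.current, new_state
        self._has_previous = True

    def toggle(self):
        # The same key acts as UNDO, then REDO, then UNDO again.
        if self._has_previous:
            self.current, self._previous = self._previous, self.current
        return self.current


editor = ToggleUndo("draft")
editor.do("draft!")
first = editor.toggle()   # acts as UNDO
second = editor.toggle()  # acts as REDO
```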

7 Invocation

The Xerox Star had a dedicated and labeled UNDO key for invoking UNDO, as did the later Sun Microsystems and Apricot Systems computers. While these keys may have been slower to press than a home-row key combined with a modifier key, they had the major advantage that developers and users alike knew what was expected, and greater consistency could have been maintained. Instead, present-day Windows programs use either Ctrl+Z or Alt+Backspace for UNDO, and either Ctrl+Y or Ctrl+Shift+Z for REDO. This inconsistency can be extremely annoying, particularly if the expected key combination has some other non-obvious function. Personally, I believe it would have been better to write the common control functions on the fronts of the letter keys to set the expectation for developers. Failing that, it might be worth allowing the user to employ any of these common accelerators to invoke these commands.

Many Windows applications also map the REDO key to repeat the last action when the REDO stack is empty. This seems like a reasonably intuitive mapping, and hopefully your users will expect it if you implement your system this way. There are several additional design choices and issues to consider in implementing a “repeat last action” key, but these are beyond the scope of this series.
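The fallback from REDO to repeat can be sketched as follows. This is a toy, assuming the only action is appending text; `RepeatingEditor` and `redo_or_repeat` are names invented for the example, and a real repeat feature would need to handle many action types.

```python
class RepeatingEditor:
    """Toy editor whose REDO key repeats the last action when nothing can be redone."""

    def __init__(self):
        self.text = ""
        self._undo = []  # appended strings, most recent last
        self._redo = []

    def type(self, s):
        self.text += s
        self._undo.append(s)
        self._redo.clear()  # a fresh action invalidates the REDO stack

    def undo(self):
        if self._undo:
            s = self._undo.pop()
            self.text = self.text[:len(self.text) - len(s)]
            self._redo.append(s)

    def redo_or_repeat(self):
        if self._redo:
            s = self._redo.pop()   # ordinary REDO
        elif self._undo:
            s = self._undo[-1]     # REDO stack empty: repeat the last action
        else:
            return
        self.text += s
        self._undo.append(s)


editor = RepeatingEditor()
editor.type("ab")
editor.redo_or_repeat()  # nothing to redo, so "ab" is repeated
editor.undo()
editor.redo_or_repeat()  # ordinary redo
```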

Most GUIs also provide UNDO and REDO items under the Edit menu. This is helpful because the menu can show a keyboard shortcut hint for users who do not know the shortcut, though I have seen many users, even experienced ones, who use the menu-based UNDO option for some reason.

8 Document vs. Editor (What to Undo)

The next question the designer may consider is what exactly should be altered by the UNDO and REDO operations? Generally, we can break the data the system holds into the document and the view. In my opinion, and from what I have seen of most applications, the state of the document should be managed by the UNDO system, but much of the state of the view should not.

At least some editions of Pro Tools have a very annoying exception where some actions on some parts of the document are not undoable. Specifically, deleting an FX plug-in does not seem to be undoable. This seems like a betrayal of the user’s trust.

In a system such as a text editor, we may want to track some, but not all, parts of the view state. For example, when undoing the insertion of a letter, the cursor should be moved back to the location where the character was inserted. We could think of this as essentially invoking a delete operation, which has the side effect of moving the cursor.

In Microsoft Word, when the cursor sits at a soft line wrap (after the last character of a wrapped line, with no carriage return), it can be rendered either at the end of that line or at the beginning of the next. Thus a cursor at a single position between two characters may be rendered in one of two locations. This is very definitely not a bug but a very helpful feature. Google Docs does it as well, and I would argue that it is something that any text editor that wraps its text must have, especially given that users will expect it. MS Word remembers which of these locations the cursor was in and restores this position when the user invokes the UNDO operation.

Alternatively, when undoing a cut operation, it may be desirable to highlight the region that has been put back, marking it just as it was before it was cut and helping the user see what was restored.

[Figure: Selection range recovery]
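Selection recovery after undoing a cut could be sketched like this. The `TextBuffer` and `CutRecord` names are hypothetical, and the selection is modeled as a simple `(start, end)` pair, which is an assumption about the view state rather than how any particular editor stores it.

```python
from dataclasses import dataclass


@dataclass
class CutRecord:
    start: int     # where the removed text began
    removed: str   # the text that was cut


class TextBuffer:
    """Toy buffer that restores the selection when a cut is undone."""

    def __init__(self, text):
        self.text = text
        self.selection = (0, 0)  # view state: (start, end) of the highlight
        self._undo = []

    def cut(self, start, end):
        removed = self.text[start:end]
        self.text = self.text[:start] + self.text[end:]
        self.selection = (start, start)  # collapsed cursor after the cut
        self._undo.append(CutRecord(start, removed))
        return removed

    def undo(self):
        if not self._undo:
            return
        record = self._undo.pop()
        s = record.start
        self.text = self.text[:s] + record.removed + self.text[s:]
        # Re-select the restored region so the user can see what came back.
        self.selection = (s, s + len(record.removed))


buffer = TextBuffer("The cat ran away")
buffer.cut(4, 8)  # removes "cat "
buffer.undo()
```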

8.1 Clipboard

Almost all systems that I have encountered exclude the clipboard from UNDO and REDO actions. If the user copies something and then invokes UNDO or REDO, the document may be changed, but the information in the clipboard is unaffected. These semantics allow the user to UNDO back to a point before something important has been deleted. Once in this state, they can copy the item and then REDO to the original state and paste in the valuable item. Alternatively, they can copy something important before invoking a number of UNDO operations and then paste in the desired item.

Once, I encountered an application where the state of the clipboard was managed by the UNDO stack. Not being able to use this familiar and useful pattern was quite annoying, and work was lost.
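The recovery pattern described above depends on the clipboard living outside the history. A minimal sketch, assuming whole-text snapshots and a made-up `ClipboardEditor` class, shows the UNDO, copy, REDO, paste sequence:

```python
class ClipboardEditor:
    """Toy editor whose clipboard is deliberately outside the UNDO history."""

    def __init__(self, text=""):
        self.text = text
        self.clipboard = ""  # never saved or restored by undo/redo
        self._undo = []
        self._redo = []

    def _commit(self, new_text):
        self._undo.append(self.text)
        self._redo.clear()
        self.text = new_text

    def delete(self, start, end):
        self._commit(self.text[:start] + self.text[end:])

    def paste(self, pos):
        self._commit(self.text[:pos] + self.clipboard + self.text[pos:])

    def copy(self, start, end):
        # Copying changes only the clipboard, so no history entry is made.
        self.clipboard = self.text[start:end]

    def undo(self):
        if self._undo:
            self._redo.append(self.text)
            self.text = self._undo.pop()

    def redo(self):
        if self._redo:
            self._undo.append(self.text)
            self.text = self._redo.pop()


editor = ClipboardEditor("keep this; other")
editor.delete(0, 11)            # the valuable text is deleted
editor.undo()                   # go back to before the deletion
editor.copy(0, 9)               # rescue "keep this"
editor.redo()                   # return to the later state
editor.paste(len(editor.text))  # paste the rescued text
```

Because `copy` does not touch either stack, the REDO after it still works, which is what makes the rescue possible.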

9 Local UNDO

As discussed earlier, it becomes tricky to remove actions from the history in an arbitrary order. We can reduce the effects of this issue by decoupling different parts of the project into separate documents, each edited in its own instance of an editor with its own independent UNDO manager. For instance, a typical Integrated Development Environment (IDE) will partition the project into different files.
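One way to picture this partitioning is a workspace that gives every open file its own history. The `Workspace` and `DocumentHistory` classes below are hypothetical, and whole-text snapshots are assumed only to keep the sketch short.

```python
class DocumentHistory:
    """One document with its own private UNDO stack (illustrative)."""

    def __init__(self, text=""):
        self.text = text
        self._undo = []

    def edit(self, new_text):
        self._undo.append(self.text)
        self.text = new_text

    def undo(self):
        if self._undo:
            self.text = self._undo.pop()


class Workspace:
    """Holds one independent history per open file."""

    def __init__(self):
        self.documents = {}

    def open(self, name, text=""):
        self.documents[name] = DocumentHistory(text)
        return self.documents[name]


workspace = Workspace()
main_py = workspace.open("main.py", "print('hi')")
util_py = workspace.open("util.py", "")
main_py.edit("print('hello')")
util_py.edit("def helper(): pass")
main_py.undo()  # affects main.py only, even though util.py was edited later
```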

This can become problematic if the IDE provides things like refactoring tools that can change multiple files in a single action. I have not had to deal with this problem so I am not discussing it further. If you, however, are implementing a tool like this, I would urge you to carefully consider the desired semantics before implementing it.

Forms can also have localized UNDO. Each text box or even widget may maintain its own UNDO history. Your particular GUI toolkit may provide this capability, but you may want to provide a global undo for a form. Again, one should think at least twice before doing something different in a part of the UI where the user already has particular expectations.

9.1 Configuration

As stated earlier, UNDO allows users to try things out and not be afraid of making mistakes. Some applications have extremely complex and confusing configuration menus comprising thousands of cryptically annotated controls deeply nested in impenetrable menu hierarchies. It might be beneficial to allow the user to try out a setting, then undo it and be navigated back to the menu containing that obscure setting’s control widget. The stack for undoing the system configuration should probably not be the same stack used for the document.

Configuration UNDO will likely be seldom used but can prove to be extremely useful when it is. So perhaps it should be implemented using a much more complex GUI that can deliver more insightful information. In many cases, configurations consist of isolated text boxes or flags, and thus can be treated differently than a document that consists of an arbitrary set of elements with complex interrelationships. Hence the system for experimenting with configuration and recovering from errors might be very much different from traditional UNDO.

IDEs also have project configuration files that similarly have complex, largely fixed structures. I have yet to see an UNDO for the project configuration, but I can see where this might be useful.

10 Granularity

In my very old copy of Microsoft Word (2007) and in Google Docs, pressing UNDO after typing a few words undoes what appears to be a rather arbitrary set of the most recently typed letters. The two behave similarly, but neither seems to follow any logic that I have been able to intuit so far. They do not just remove the last word typed but often delete a sequence of letters that may break in the middle of a word. I suspect that the behavior is timing-related: a sequence of letters typed quickly is chunked into what the system presumes the user will think of as a single action. In Notepad, hitting UNDO after typing a sequence of letters deletes all the consecutively typed letters. It does not seem to look at delays at all. Hitting backspace to correct a typo will break the chunking in all three cases. We could potentially also try to inform this process by parsing the input. This would be particularly easy in an IDE, where programming languages are being edited.
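The timing-based chunking I suspect is at work could be sketched as follows. This is a guess at the general idea, not a description of how Word or Google Docs actually behave; the `ChunkingHistory` class, the one-second threshold, and the explicit timestamps are all assumptions made so the example is deterministic.

```python
class ChunkingHistory:
    """Groups characters typed in quick succession into one UNDO step."""

    def __init__(self, pause_threshold=1.0):
        self.text = ""
        self._chunks = []      # each entry is a run of characters typed together
        self._last_time = None
        self._pause = pause_threshold

    def type_char(self, ch, timestamp):
        starts_new_chunk = (
            self._last_time is None
            or not self._chunks
            or timestamp - self._last_time > self._pause
        )
        if starts_new_chunk:
            self._chunks.append(ch)
        else:
            self._chunks[-1] += ch
        self.text += ch
        self._last_time = timestamp

    def undo(self):
        if self._chunks:
            chunk = self._chunks.pop()
            self.text = self.text[:len(self.text) - len(chunk)]
        self._last_time = None  # typing after an UNDO starts a fresh chunk
        return self.text


def type_text(history, s, start, step=0.1):
    # Helper that simulates steady typing beginning at time `start`.
    for i, ch in enumerate(s):
        history.type_char(ch, start + i * step)


history = ChunkingHistory(pause_threshold=1.0)
type_text(history, "The cat ", start=0.0)
type_text(history, "ran away", start=5.0)  # typed after a long pause
```

With these inputs, the first UNDO removes "ran away" and the second removes "The cat ".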

It might also be desirable to automatically turn off chunking selectively. For instance, if the user hits UNDO and a chunk is undone, and then hits REDO and that chunk is restored, then hitting UNDO a second time should perhaps undo only one letter at a time until that chunk is completely undone, after which the next chunk will be undone as a whole.

For example, the user rapidly types:

The cat

Then they pause, and then they type:

ran away

Yielding:

The cat ran away

Then they hit UNDO and, because of chunking, they are left with:

The cat

Then they press REDO and get:

The cat ran away

If we were to use the suggested automatic chunking inhibition, then pressing UNDO would now result in:

The cat ran awa

Rather than going back to the following in a single step:

The cat

Once they had pressed UNDO enough times to get back to:

The cat

Pressing UNDO again could delete the rest of the text.
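One simple way to get this behavior, sketched here under my own assumptions, is to split a chunk into single characters whenever it is redone. `InhibitingHistory` is a made-up name, and chunks are supplied directly rather than detected from timing.

```python
class InhibitingHistory:
    """UNDO history where a redone chunk is afterwards undone letter by letter."""

    def __init__(self):
        self.text = ""
        self._undo = []  # pieces of text, each undone in one step
        self._redo = []

    def add_chunk(self, chunk):
        self.text += chunk
        self._undo.append(chunk)
        self._redo.clear()

    def undo(self):
        if self._undo:
            piece = self._undo.pop()
            self.text = self.text[:len(self.text) - len(piece)]
            self._redo.append(piece)
        return self.text

    def redo(self):
        if self._redo:
            piece = self._redo.pop()
            self.text += piece
            # Chunking inhibition: extending a list with a string adds one
            # entry per character, so later UNDOs remove a letter at a time.
            self._undo.extend(piece)
        return self.text


history = InhibitingHistory()
history.add_chunk("The cat ")
history.add_chunk("ran away")
history.undo()               # whole chunk: "The cat "
history.redo()               # "The cat ran away", chunk now split
one_back = history.undo()    # only the final letter is removed
```

Untouched chunks, such as "The cat " here, are still undone in a single step.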

Alternatively, perhaps text that was typed a while ago could revert to a different chunking regime. The user is unlikely to recall what sequence of letters were typed together minutes after they entered them, and perhaps at that point an UNDO action should chunk on word boundaries.

This is not to say that we would omit any actions from our history; we would just disregard the time stamps when chunking.

11 Persistence

In general, the UNDO and REDO stacks are cleared when the file is saved and closed. When reloaded it is not possible to redo actions from the earlier session. Some applications like Pro Tools will make periodic backups of the file so that the user can at least see these prior states. Of course a source control system lets the user save selected document snapshots.

Since one can keep an application open for days on a phone without it being in focus, this adds yet another dimension to consider when deciding when to dump the UNDO stacks.

UNDO is perhaps regarded as largely working in concert with the user’s short term memory and persistent UNDO stacks are not seen as all that useful.

12 External Agents and Multiple Users

Many editors today can also do things for the user automatically. For instance, I misspelled or mistyped many of the longer words in this paper as I was rapidly hammering away on my keyboard, but I have programmed my word processor to automatically replace them with correctly spelled words. In MS Word, these autocorrections are pushed onto the UNDO stack along with my typing and editing. Hence, if a word is incorrectly replaced by the system, I can press UNDO to restore it to the way it was when I typed it. Essentially, the autocorrect assistant and I are taking turns typing on the keyboard. The autocorrect system is triggered when I hit space at the end of each word, checking and respelling each word as I go. MS Word seems to log the undoing of an autocorrection so that it does not repeat a rejected autocorrection on a particular instance of a word. I have not fully analyzed the semantics of Word. You may choose to base your system's behavior on some other application or devise your own, but whatever you do, I would urge you to consider what behavior will be optimal for your users rather than just leaving it to chance.
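The turn-taking idea can be sketched by giving the assistant's correction its own entry on the UNDO stack, plus a record of rejected corrections. This is only my reading of the behavior described above, not Word's actual mechanism; `AutoCorrectingEditor`, the word-level granularity, and keying rejections by word position are all simplifying assumptions.

```python
class AutoCorrectingEditor:
    """Toy editor where autocorrections are separate, undoable steps."""

    def __init__(self, corrections):
        self.words = []
        self._corrections = corrections  # e.g. {"teh": "the"}
        self._undo = []                  # ("type", word) or ("correct", typed_word)
        self._rejected = set()           # (position, typed_word) the user has undone

    def type_word(self, word):
        # The user's turn: typing is recorded as its own step.
        self.words.append(word)
        self._undo.append(("type", word))
        position = len(self.words) - 1
        # The assistant's turn: correct unless this instance was rejected.
        fixed = self._corrections.get(word)
        if fixed is not None and (position, word) not in self._rejected:
            self.words[position] = fixed
            self._undo.append(("correct", word))

    def undo(self):
        if not self._undo:
            return
        kind, word = self._undo.pop()
        if kind == "correct":
            # Corrections always apply to the most recently typed word.
            position = len(self.words) - 1
            self.words[position] = word
            self._rejected.add((position, word))
        else:
            self.words.pop()


editor = AutoCorrectingEditor({"teh": "the"})
editor.type_word("teh")
after_typing = list(editor.words)  # the assistant has corrected the word
editor.undo()
after_undo = list(editor.words)    # the user's original spelling is back
```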

13 Visualization and Sonification

Apple’s recommendations advise informing the user about what UNDO will undo and, once it is invoked, giving feedback so the user knows what was undone. The latter seems even more necessary as many iPhone apps employ a recognition-based shaking gesture to invoke UNDO, as opposed to pressing a button that provides a definitive tactile response.

In some GUI-based apps, the Edit menu may display a description of the actions that will be invoked when UNDO and REDO are pressed. Hover-over tooltips could also be used to show what UNDO and REDO will do at a particular point in time. Even if there is not enough room to display such a description, the UNDO and REDO items can at least be grayed out when their respective stacks are empty. Perhaps the target of the undo could be called out in some way as part of a hover-over tooltip, though this may not be worth the effort.
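Menu labels and grayed-out states fall out naturally if each history entry carries a short description. The sketch below assumes a hypothetical `LabeledHistory` class that returns `(label, enabled)` pairs for a GUI toolkit to render; the descriptions are plain strings chosen for the example.

```python
class LabeledHistory:
    """Tracks action descriptions so menus can say what UNDO/REDO will do."""

    def __init__(self):
        self._undo = []  # descriptions, most recent last
        self._redo = []

    def record(self, description):
        self._undo.append(description)
        self._redo.clear()

    def undo(self):
        if self._undo:
            self._redo.append(self._undo.pop())

    def redo(self):
        if self._redo:
            self._undo.append(self._redo.pop())

    @staticmethod
    def _item(verb, stack):
        # An empty stack yields a bare, disabled (grayed-out) menu item.
        if stack:
            return (f"{verb} {stack[-1]}", True)
        return (verb, False)

    def menu_items(self):
        return [self._item("Undo", self._undo), self._item("Redo", self._redo)]


history = LabeledHistory()
history.record("Typing")
history.record("Delete")
before = history.menu_items()
history.undo()
after = history.menu_items()
```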

As discussed earlier, undoing a delete operation should show the restored item as selected if it was selected before being deleted (that is, not just deleted with a backspace).

In some applications, audio feedback of some kind might also be worth considering. This might just be in the form of a click when the REDO stack is empty, or perhaps a voice could even summarize what was done, though this might only be appropriate for blind users or eyes-free interfaces.

14 Conclusion

UNDO/REDO is not as simple as it might appear, but our goal is to make it simple and intuitive for the user. While there may not be a great deal of room to innovate, as user expectations are well established, there is still a need to plan carefully and try to anticipate users’ expectations for new situations that one’s innovative application might create. One should also be careful to think through this aspect of the user experience in the design phase, lest one paint oneself into a corner during implementation. In the final installment, I hope to provide implementation patterns that can get you, the implementer, out of whatever corner you find yourself in, but I would still urge the interaction designer to develop a thorough plan for the interaction semantics first.

In the next installment, I will take a brief look at simple implementation methods, and then in the final installment I will discuss engineering details.
