In-depth rich text editor

Editor introduction

Common rich text editors fall into two categories: those built on textarea and those built on contenteditable.

textarea

The structure is simple and easy to use, but text formatting and complex styles are difficult to implement. It is recommended only for scenarios with modest editing requirements.

contenteditable

When an element's contenteditable property is set to true, the element becomes the body of the editor. Most functions can be achieved with document.execCommand, and mainstream editors are designed based on contenteditable.

However, relying solely on contenteditable to produce HTML directly causes problems. For example, the same input may produce different output in different browsers, and the same output may be displayed differently in different browsers; these problems are magnified on mobile. In addition, HTML has limitations and is not convenient for cross-platform use.

Therefore, a better solution is to define a data structure plus a document model, so that every input goes through the editor and produces output in an agreed format. That output can then be parsed on different platforms with the expected result.

There is another type of editor, represented by Google Docs, which does not use contenteditable but renders on canvas [1] and simulates editing by listening to user input. This kind of editor is extremely expensive and complicated to implement.

This article takes quill[2] as an example to introduce how to implement a rich text editor that supports cross-platform rendering and can insert custom modules.

Basic concepts

delta[3]

A data structure used to describe rich text content or a change to that content. It is plain JSON and maps directly to JS objects, which makes it easy to work with. A delta consists of a list of ops.

An op is a JS object that can be understood as one change to the current content. It has the following properties.

insert: the content to insert; the possible values and their meanings are introduced later in the data structure section

retain: a number; keeps content of the corresponding length

delete: a number; deletes content of the corresponding length

Exactly one of the three properties above must appear in each op object

attributes: optional, the value is an object, which can describe formatting information
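The rule that each op carries exactly one of insert, retain, or delete can be written down as a small type plus a check. This is a minimal sketch: the names Op, Attributes, and isValidOp are assumptions made for illustration and are not part of the delta library.

```typescript
// Illustrative types; Op, Attributes and isValidOp are hypothetical names.
type Attributes = Record<string, unknown>;

interface Op {
  insert?: string | Record<string, unknown>;
  retain?: number;
  delete?: number;
  attributes?: Attributes;
}

// True when exactly one of insert / retain / delete is present on the op.
function isValidOp(op: Op): boolean {
  const actions = [op.insert, op.retain, op.delete].filter(
    (value) => value !== undefined,
  );
  return actions.length === 1;
}

const formatted: Op = { insert: 'Grass', attributes: { bold: true } };
const ambiguous: Op = { insert: 'x', retain: 1 };

const formattedIsValid = isValidOp(formatted); // true
const ambiguousIsValid = isValidOp(ambiguous); // false
```

An op that only carries attributes is rejected as well, because attributes describe formatting for one of the three actions rather than being an action themselves.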

To see how a delta describes both content and a change to content, consider an example. The following data represents the content "Grass the Green":

{
  ops: [
    { insert: 'Grass', attributes: { bold: true } },
    { insert: ' the ' },
    { insert: 'Green', attributes: { color: '#00ff00' } }
  ]
}

Applying the following delta as a change produces the new content "Grass the Blue".

{
  ops: [
    // Unbold and italicize the next 5 characters ("Grass")
    { retain: 5, attributes: { bold: null, italic: true } },
    // Keep the next 5 characters (" the ") unchanged
    { retain: 5 },
    // Insert "Blue"
    { insert: 'Blue', attributes: { color: '#0000ff' } },
    // Delete the next 5 characters ("Green")
    { delete: 5 }
  ]
}
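To make the transformation concrete, here is a simplified sketch of applying a change delta to a document delta. It is not the delta library's own compose implementation: it only handles text inserts, expands the document to one op per character, and does not merge adjacent ops afterwards. The names applyChange and mergeAttrs are hypothetical.

```typescript
type Attrs = Record<string, unknown>;
interface InsertOp { insert: string; attributes?: Attrs }
type ChangeOp =
  | { insert: string; attributes?: Attrs }
  | { retain: number; attributes?: Attrs }
  | { delete: number };

// Merge formatting; a null value in the patch removes that attribute.
function mergeAttrs(base?: Attrs, patch?: Attrs): Attrs | undefined {
  const merged: Attrs = { ...(base || {}) };
  const changes = patch || {};
  for (const key of Object.keys(changes)) {
    if (changes[key] === null) delete merged[key];
    else merged[key] = changes[key];
  }
  return Object.keys(merged).length > 0 ? merged : undefined;
}

function applyChange(doc: InsertOp[], change: ChangeOp[]): InsertOp[] {
  // Expand the document to one op per character so positions are simple indices.
  const chars: InsertOp[] = [];
  for (const op of doc) {
    for (const ch of op.insert.split('')) {
      chars.push({ insert: ch, attributes: op.attributes });
    }
  }
  const out: InsertOp[] = [];
  let pos = 0;
  for (const op of change) {
    if ('insert' in op) {
      out.push({ insert: op.insert, attributes: op.attributes });
    } else if ('delete' in op) {
      pos += op.delete; // skip the deleted characters
    } else {
      for (const c of chars.slice(pos, pos + op.retain)) {
        out.push({ insert: c.insert, attributes: mergeAttrs(c.attributes, op.attributes) });
      }
      pos += op.retain;
    }
  }
  // Anything after the last change op is kept as-is.
  for (const c of chars.slice(pos)) out.push(c);
  return out;
}

const doc: InsertOp[] = [
  { insert: 'Grass', attributes: { bold: true } },
  { insert: ' the ' },
  { insert: 'Green', attributes: { color: '#00ff00' } },
];
const change: ChangeOp[] = [
  { retain: 5, attributes: { bold: null, italic: true } },
  { retain: 5 },
  { insert: 'Blue', attributes: { color: '#0000ff' } },
  { delete: 5 },
];
const result = applyChange(doc, change);
const resultText = result.map((op) => op.insert).join(''); // 'Grass the Blue'
```

In a real project the delta library's compose method would normally do this work; the sketch only shows why retain, insert, and delete are enough to express the change.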

A delta is essentially a sequence of operation records; rendering it can be seen as replaying the steps from a blank document to the target document. HTML, by contrast, is a tree structure, so the linear structure of a delta has a natural advantage over HTML in business use.

parchment[4]

A document model made up of blots. It describes the document content and can be extended with custom blots.

<p>
  A text plus video rich text content.
  <img src="xxx" alt="">
</p>
<p>
  <strong>Bold end of text.</strong>
</p>

The relationship between parchment and blots is similar to that between the DOM and element nodes. The figure below shows how the HTML above is described by a DOM tree and by a parchment tree.

 

parchment provides several basic blots, and supports the development of your own blots according to your needs. We will demonstrate how to develop a custom blot later.

{
  // Base node
  ShadowBlot,
  // Container node => base node
  ContainerBlot,
  // Format node => container node
  FormatBlot,
  // Leaf node
  LeafBlot,
  // Editor root node => container node
  ScrollBlot,
  // Block-level node => format node
  BlockBlot,
  // Inline node => format node
  InlineBlot,
  // Text node => leaf node
  TextBlot,
  // Embedded node => leaf node
  EmbedBlot,
}

Finally, a diagram helps to understand Quill's internal workflow. The business-layer logic that developers need to care about is very simple: the editor content can be changed through manual input or API methods, and the editor's change event outputs both the delta for the current operation and the latest content.

 

Practical application

Data flow

In the business, the basic data flow should be as shown in the figure below. The editor generates delta data, and then the parser of the corresponding platform renders the corresponding content.

 

Data structure

A well-designed content data structure plays a key role in later maintenance and cross-platform rendering. We can move the media data (pictures, videos, custom formats) that the rich text depends on to the outer layer, which keeps future extension and rendering consistent.

interface ItemContent {
    // Rich text data, stored as a delta string
    text?: string;
    // Videos
    videoList?: Video[];
    // Pictures
    imageList?: Image[];
    // Custom modules such as polls, advertising cards, questionnaire cards, etc.
    customList?: Custom[];
}
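As an illustration of moving media to the outer layer, the sketch below walks the delta ops once and collects picture and video references next to the serialized delta. The field shapes of Image and Video and the helper name toItemContent are assumptions; the article does not define them.

```typescript
interface DeltaOp {
  insert: string | Record<string, unknown>;
  attributes?: Record<string, unknown>;
}

// Hypothetical shapes for the outer-layer media entries.
interface Image { url: string }
interface Video { videoId: string; url: string }

interface ItemContent {
  text?: string;
  videoList?: Video[];
  imageList?: Image[];
}

function toItemContent(ops: DeltaOp[]): ItemContent {
  const imageList: Image[] = [];
  const videoList: Video[] = [];
  for (const op of ops) {
    if (typeof op.insert === 'string') continue; // plain text carries no media
    const image = op.insert['image'];
    if (typeof image === 'string') imageList.push({ url: image });
    const poster = op.insert['videoPoster'] as
      | { url?: unknown; videoId?: unknown }
      | undefined;
    if (poster) {
      const url = poster.url;
      const videoId = poster.videoId;
      if (typeof url === 'string' && typeof videoId === 'string') {
        videoList.push({ videoId, url });
      }
    }
  }
  // The delta itself is stored as a string, as in ItemContent.text.
  return { text: JSON.stringify({ ops }), imageList, videoList };
}

const content = toItemContent([
  { insert: 'Intro\n' },
  { insert: { image: 'https://example.com/a.png' } },
  { insert: { videoPoster: { url: 'https://example.com/cover.png', videoId: 'v1' } } },
]);
```

Keeping the lists outside the delta means a renderer on any platform can resolve media without having to understand every embed format first.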

The editor outputs standard delta data with the following structure:

//plain text, \n stands for newline
{
    insert: string;
},
// text with formatting
{
    insert: 'hyperlink text',
    attributes: {
        // text color
        color: string,
        // bold
        bold: boolean,
        // hyperlink address
        link: string,
        ...,
    }
},
// ordered or unordered list
{
    insert: '\n',
    attributes: {
        list: 'ordered' | 'bullet'
    }
},
// placeholder shown while a resource is uploading
{
    insert: {
        uploading: {
            // resource type
            type: 'image' | 'video' | 'vote' | 'and more...',
            // resource id
            uid: string
        },
    },
},
// picture
{
    insert: { image: '${image_uri}' }
},
// video
{
    insert: {
        videoPoster: {
            /** Video cover address */
            url: string;
            /** Video id */
            videoId: string;
        }
    }
},
//vote
{
    insert: {
        vote: {
            voteId: string
        }
    }
},
// Indent: all text in scope is indented to the right by `indent` units.
// Scope: walk backwards from the current position until either of the following is met:
// 1. A plain-text \n
// 2. An op whose attributes contain indent with a value less than or equal to the current value
{
    insert: '\n',
    attributes: {
        // integer from 1 to 8
        indent: number,
    }
},

Image/video uploading

Image uploading must show an uploading state without blocking the user's editing, so insert a placeholder element first and replace it with the real image or video once the upload completes.
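One way to express the replacement step is as a change delta: retain everything before the placeholder, insert the real image, and delete the placeholder, which has length 1 like any embed. The sketch below computes such a change from the current ops. The helper names are hypothetical, and the embed shapes follow the data structure section above.

```typescript
type Embed = Record<string, unknown>;
interface DocOp { insert: string | Embed; attributes?: Record<string, unknown> }
type ChangeOp = { retain: number } | { delete: number } | { insert: string | Embed };

// Length of an op in delta terms: string length for text, 1 for an embed.
function opLength(op: DocOp): number {
  return typeof op.insert === 'string' ? op.insert.length : 1;
}

// Builds the change that swaps the uploading placeholder with the given uid
// for the uploaded image. Returns null if the placeholder is gone, for
// example because the user deleted it while the upload was running.
function buildReplaceChange(doc: DocOp[], uid: string, imageUri: string): ChangeOp[] | null {
  let index = 0;
  for (const op of doc) {
    if (typeof op.insert !== 'string') {
      const uploading = op.insert['uploading'] as { uid?: unknown } | undefined;
      if (uploading && uploading.uid === uid) {
        const change: ChangeOp[] = [];
        if (index > 0) change.push({ retain: index });
        change.push({ insert: { image: imageUri } }, { delete: 1 });
        return change;
      }
    }
    index += opLength(op);
  }
  return null;
}

const current: DocOp[] = [
  { insert: 'Hi\n' },
  { insert: { uploading: { type: 'image', uid: 'u1' } } },
  { insert: '\n' },
];
const replaceChange = buildReplaceChange(current, 'u1', 'https://example.com/a.png');
```

The resulting ops could then be wrapped in a Delta and applied with quill's updateContents, so the user's cursor and surrounding edits are left alone.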

Custom blot

The benefit of a custom blot is that an entire feature (such as charting) can be encapsulated in a single blot and used directly in business development without caring how it is implemented. The following uses the uploading-state placeholder blot for images and videos as an example of how to write a custom blot.

import Quill from 'quill';

enum MediaType {
  Image = 'image',
  Video = 'video',
}

interface UploadingType {
  type: MediaType;
  // Unique id; when the image or video finishes uploading, the placeholder with this uid is replaced
  uid: string;
}

export const BlockEmbed = Quill.import('blots/block/embed');

class Uploading extends BlockEmbed {
  // Values of all placeholders created so far, keyed by uid
  static _value: Record<string, UploadingType> = {};

  static create(value: UploadingType) {
    const ELEMENT_SIZE = 60;
    // The DOM node corresponding to the blot
    const node = super.create() as HTMLElement;
    this._value[value.uid] = value;
    // The placeholder itself must not be editable
    node.setAttribute('contenteditable', 'false');
    node.style.width = `${ELEMENT_SIZE}px`;
    node.style.height = `${ELEMENT_SIZE}px`;
    node.style.backgroundImage = `url(placeholder address)`;
    node.style.backgroundSize = 'cover';
    node.style.margin = '0 auto';
    // Used to tell which resource this placeholder belongs to
    node.setAttribute('data-uid', value.uid);
    return node;
  }

  // Returns the value stored for this node; it becomes the embed value in the delta
  static value(node: HTMLElement) {
    const uid = node.dataset.uid;
    return uid ? this._value[uid] : undefined;
  }
}

Uploading.blotName = 'uploading';
Uploading.tagName = 'div';

export default Uploading;

Register the custom blot with Quill, then insert it through the insertEmbed method of the editor instance.

// editor.tsx
Quill.register(Uploading);

quill.insertEmbed(1, 'uploading', {
  type: 'image',
  uid: 'xxx',
});

Handling paste operations

Copy and paste can greatly improve editing efficiency, but videos and pictures in the clipboard need special handling: convert the clipboard content into the custom format and upload the pictures and videos automatically.

Basic principle

Listen for the user's paste operation, read the clipboardData[6] carried by the paste event[5], process it, and then insert the result into the editor.

target.addEventListener('paste', (event) => {
    // window.clipboardData is the legacy IE fallback
    const clipboardData = event.clipboardData || window.clipboardData;
    const text = clipboardData.getData('text');
    const html = clipboardData.getData('text/html');

    /**
     * Business logic
     */

    // Stop the browser from inserting the raw clipboard content
    event.preventDefault();
});

clipboardData.items is a DataTransferItemList, an array-like collection of DataTransferItem objects that holds the data of this paste operation.

DataTransferItem has two properties, kind and type. kind is either string or file, with file used for file-type data; type is the MIME type, commonly text/plain or text/html.
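Using kind and type, image files can be picked out of the pasted items. The sketch below works against a small structural interface rather than the DOM types, so it can be tried outside a browser; TransferItemLike and pickImageFiles are hypothetical names. In a browser, clipboardData.items can be passed in directly, since DataTransferItem exposes the same kind, type, and getAsFile members.

```typescript
// Structural stand-in for DataTransferItem.
interface TransferItemLike<F> {
  kind: string; // 'string' or 'file'
  type: string; // MIME type, e.g. 'image/png' or 'text/html'
  getAsFile(): F | null;
}

// Returns the files of all items that are image files.
function pickImageFiles<F>(items: ArrayLike<TransferItemLike<F>>): F[] {
  const files: F[] = [];
  for (let i = 0; i < items.length; i += 1) {
    const item = items[i];
    if (item.kind === 'file' && item.type.indexOf('image/') === 0) {
      const file = item.getAsFile();
      if (file !== null) files.push(file);
    }
  }
  return files;
}

// Fake items standing in for a paste that contains HTML and one PNG file.
const pastedItems: TransferItemLike<string>[] = [
  { kind: 'string', type: 'text/html', getAsFile: () => null },
  { kind: 'file', type: 'image/png', getAsFile: () => 'photo.png' },
  { kind: 'file', type: 'application/pdf', getAsFile: () => 'doc.pdf' },
];
const pastedImages = pickImageFiles(pastedItems);
```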

Processing images

Images in the clipboard come from two kinds of sources: files copied directly from the file system, and content copied from a web page.

Copy from file system

After copying from the file system and pasting, you can get a File object and insert it directly into the editor, reusing the image upload logic described earlier.

Copy from webpage

As the figure above shows, content copied from a web page includes the text/html rich-text type. Because the picture address may be temporary, using a third-party picture address directly is unreliable; instead, extract the picture address from the HTML and upload the picture to our own server. The upload can reuse the image uploading logic described earlier.

The basic structure of the DOM tree for the content above is shown in the figure. All nodes can be flattened into an array through post-order traversal; when a node is a picture, the image uploading logic described earlier is called.

convert({ html, text }, formats = {}) {
    if (!html) {
      return new Delta().insert(text || '');
    }
    //Returns an HTMLDocument object
    const doc = new DOMParser().parseFromString(html, 'text/html');
    const container = doc.body;
    // key - node
    // value - matcher: (node, delta, scroll) => newDelta
    const nodeMatches = new WeakMap();
    //Returns two matchers, processing ELEMENT_NODE and TEXT_NODE respectively, and converting dom to Delta
    const [elementMatchers, textMatchers] = this.prepareMatching(
      container,
      nodeMatches,
    );
    
    return traverse(
      this.quill.scroll,
      container,
      elementMatchers,
      textMatchers,
      nodeMatches,
    );
}


function traverse(scroll, node, elementMatchers, textMatchers, nodeMatches) {
  // Text node: run the text matchers on it
  if (node.nodeType === node.TEXT_NODE) {
    return textMatchers.reduce((delta, matcher) =>  {
      return matcher(node, delta, scroll);
    }, new Delta());
  }
  if (node.nodeType === node.ELEMENT_NODE) {
    return Array.from(node.childNodes || []).reduce((delta, childNode) =>  {
      let childrenDelta = traverse(
        scroll,
        childNode,
        elementMatchers,
        textMatchers,
        nodeMatches,
      );
      if (childNode.nodeType === node.ELEMENT_NODE) {
        childrenDelta = elementMatchers.reduce((reducedDelta, matcher) =>  {
          return matcher(childNode, reducedDelta, scroll);
        }, childrenDelta);
        childrenDelta = (nodeMatches.get(childNode) || []).reduce(
          (reducedDelta, matcher) =>  {
            return matcher(childNode, reducedDelta, scroll);
          },
          childrenDelta,
        );
      }
      return delta.concat(childrenDelta);
    }, new Delta());
  }
  return new Delta();
}

The data in the above example can be converted into the following delta data. The processing method of video is similar to that of pictures, and will not be repeated here.

{
    ops: [
        {
            insert: 'Speaking of the name Ai Dongmei, young people today may not be very familiar with it, but she used to be a household name.'
        },
        {
            insert: 'Ai Dongmei is a famous marathon runner in my country',
            attributes: {
                bold: true
            },
        },
        {
            insert: ". Born in 1981, she is a girl from the Northeast. Like many ordinary post-80s, she comes from an ordinary family and has lived a very happy life since she was a child. Although her family is not rich, Ai Dongmei is still the apple of her parents' eye."
        },
        {
            insert: {
                image: {
                    url: 'xxx'
                }
            }
        },
        {
            insert: 'But what makes Ai Dongmei different from others is that she has shown an amazing talent for long-distance running since she was a child',
            attributes: {
                bold: true
            },
        },
        {
            insert: '. In 1993, Ai Dongmei was still in elementary school. She won a very good result in a running competition. She broke the local 3000-meter event record when her toe was injured, far exceeding all the participants. This shocked many people, so Ai Dongmei was successfully selected by Qiqihar Sports School.'
        }
    ]
}

Parsing data

In web scenarios, the quill-delta-to-html [7] library can be used for parsing. Mini programs are relatively unfriendly to media elements (for example, the width and height of an image in a mini program must be specified [8]), so you need to write the parser yourself. The following briefly introduces how to render delta data.

Since a delta is a linear structure, converting it to DOM requires building a tree that associates each block-level element with its children.
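A rough sketch of that tree-building step: in delta data a newline closes a line, and the attributes on that newline op (list, indent, and so on) describe the whole line. Grouping ops by newline therefore yields one block per line with its inline children. The names Line and groupIntoLines are hypothetical, and the sketch assumes that text ops containing newlines carry only block-level attributes; embeds are simply treated as children of the current line.

```typescript
interface Op {
  insert: string | Record<string, unknown>;
  attributes?: Record<string, unknown>;
}

interface Line {
  children: Op[];                      // inline content of the block
  attributes: Record<string, unknown>; // block-level formatting
}

function groupIntoLines(ops: Op[]): Line[] {
  const lines: Line[] = [];
  let children: Op[] = [];
  for (const op of ops) {
    if (typeof op.insert !== 'string') {
      children.push(op); // embed: keep as a child of the current line
      continue;
    }
    op.insert.split('\n').forEach((part, i) => {
      if (i > 0) {
        // A newline closes the current line; its attributes format the line.
        lines.push({ children, attributes: op.attributes || {} });
        children = [];
      }
      if (part.length > 0) children.push({ insert: part, attributes: op.attributes });
    });
  }
  // Content after the last newline forms a final, unformatted line.
  if (children.length > 0) lines.push({ children, attributes: {} });
  return lines;
}

const grouped = groupIntoLines([
  { insert: 'Title' },
  { insert: '\n', attributes: { list: 'ordered' } },
  { insert: 'one\ntwo\n' },
]);
// grouped has 3 lines: "Title" (ordered list item), "one", "two"
```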

The original data in the figure above goes through a first round of processing:

  1. Denormalize plain text: convert text of the form abc\ndef\ng into [abc, \n, def, \n, g]

  2. Write the meta information of each block-level element into its first op

The meta information of a block-level element includes: indentation, the sequence number within an ordered list, the start and end index in the original data of the block-level element that contains the current element, and the index of that block-level element in the DOM list
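Step 1 above, denormalizing plain text, can be sketched as a small function that splits on newlines and keeps each newline as its own entry. The function name denormalize is an assumption made for illustration.

```typescript
// Splits text into segments, keeping every '\n' as a separate entry.
function denormalize(text: string): string[] {
  const parts: string[] = [];
  text.split('\n').forEach((segment, i) => {
    if (i > 0) parts.push('\n'); // one newline between consecutive segments
    if (segment.length > 0) parts.push(segment);
  });
  return parts;
}

const denormalized = denormalize('abc\ndef\ng');
// denormalized is ['abc', '\n', 'def', '\n', 'g']
```

With every newline isolated, block-level meta information has a single, predictable place to attach to.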

After the above conversion, the original data takes the format shown in the figure above, and each op carries its corresponding metadata. The next step is to parse these ops and convert them into Elements.

For rendering custom blots, we can encapsulate them as components (React or Vue components, depending on the framework you use). This decouples business features from editor development, so developers who do not know the editor code can also contribute.
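One possible way to get that decoupling is a registry that maps an embed name to a renderer, so a new card only needs a new registration. The sketch below is framework-neutral: instead of returning a React or Vue element it returns a plain descriptor, and all names in it (ComponentNode, registerRenderer, renderEmbed, VoteCard) are hypothetical.

```typescript
// Framework-neutral description of a component to render.
interface ComponentNode {
  component: string;
  props: Record<string, unknown>;
}

type EmbedRenderer = (value: unknown) => ComponentNode;

const renderers: Record<string, EmbedRenderer> = {};

function registerRenderer(name: string, renderer: EmbedRenderer): void {
  renderers[name] = renderer;
}

// Looks up a renderer by the embed's key, e.g. { vote: { voteId } } -> 'vote'.
function renderEmbed(embed: Record<string, unknown>): ComponentNode | null {
  for (const name of Object.keys(embed)) {
    const render = renderers[name];
    if (render) return render(embed[name]);
  }
  return null; // unknown embed types are skipped
}

// A business team can add its own card without touching the parser.
registerRenderer('vote', (value) => {
  const vote = value as { voteId?: string };
  return { component: 'VoteCard', props: { voteId: vote.voteId } };
});

const voteNode = renderEmbed({ vote: { voteId: 'v42' } });
const unknownNode = renderEmbed({ chart: { id: 'c1' } }); // null, nothing registered
```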

Summary

So far, we have covered the basic process of developing an editor and the points that deserve attention. If the business needs additional functional cards, such as the various applications in Feishu documents, they can be implemented by adding a blot and writing the corresponding component. Cross-platform rendering is also straightforward: write a parser for the target platform to display the content in non-web scenarios.


Posted by jcornett on Mon, 19 Sep 2022 21:14:08 +0300