The speed is increased hundreds of times, and the application of the data structure in practical work is recorded once

During this period of time, I wrote a lot of source code analysis. This article wants to change the taste and share with you a case I encountered in my work. After all, as a part-time worker, in addition to looking at the source code at work, bricks still have to be moved. This article will share a story of using appropriate data structures to optimize performance, which greatly improves responsiveness by hundreds of times.

Here's the thing, I'm currently working for a foreign company, and we have an offline APP for foreigners to sell goods, such as clothes, shoes, Coke and so on. One day, the product manager came to me and made a request: I need to support three-tier product options. When I heard this demand, my first reaction was that I didn't seem to have seen three-tier product options. After all, I am also a senior shopkeeper for more than ten years. The general product options seem to have at most two layers. For example, the following is an e-commerce APP Options for a typical shoe:

This shoe is a two-tier product option, one is color and the other is size. There are a total of 11 colors and 11 sizes. In order to verify my intuition, I opened all the shopping apps on my phone, including Taobao, JD.com, Pinduoduo, and Suning.com. Among the products I've seen, I haven't found a product with three-tier options, at most two.

The sample code that can be run in this article has been uploaded to GitHub, and you can take it down and play: https://github.com/dennis-jiang/Front-End-Knowledges/tree/master/Examples/DataStructureAndAlgorithm/OptimizeVariations

Why is no one doing the three-tier option?

If one or two families do not do this, it may be that the needs of each family are different, but if everyone does not do it, it feels that things are not right. After careful analysis, I think there may be two reasons for not doing the three-tier option:

1. This may be a false demand

The shoe above is available in 11 colors and 11 sizes, which means that these options are followed by 11*11, a total of 121 items. If there is another third-level option, assuming that the third level also has 11 options, the corresponding products are 11 * 11 * 11 in total, that is, 1331 products. Many stores may not have 1331 products in total. In other words, the third-level option may be a pseudo-demand. Users do not have so many options on the third level. Let’s take the above shoes as an example. In addition to the color and size, if you have to add another level, it can only be Gender, that is, men's shoes and women's shoes. For men's shoes and women's shoes, the version and size are very different. Generally, they are not placed under one product. The more common practice is to divide them into two products, each with its own color and size.

2. Has performance issues

It is not difficult to just add the function of the third-level option, that is, to display a few more buttons that can be clicked. The click logic is not much different from the two-level option. But after thinking about it, I found that he has potential performance problems. Taking the above pair of shoes as an example, the data I get from the back-end API is like this:

const merchandise = {
  // variations store all options
  variations: [
    {
      name: 'color',
      values: [
        { name: 'Limited Edition 574 Navy Blue' },
        { name: 'Limited Edition 574 White Powder' },
        // 9 more below
      ]
    },
    {
      name: 'size',
      values: [
        { name: '38' },
        { name: '39' },
        // 9 more below
      ]
    },
  ],
  // The products array stores all the products
  products: [
    {
      id: 1,
      price: 208,
      // The correspondence with the above variations is in the variationMappings of each product
      variationMappings: [
        { name: 'color', value: 'Limited Edition 574 White Powder' },
        { name: 'size', value: '38'},
      ]
    },
    // There are more than 100 products below
  ]
}

The above structure itself is quite clear. merchandise.variations is an array with several layers of options. This array has several objects. The name of each object is the name of the current level, and the values ​​are the options contained in the current level. Therefore, merchandise .variations can be directly displayed on the UI, and they can be rendered into buttons hierarchically.

In the above picture, the user selects the limited edition 574 white powder on the first layer, and the non-existent products such as 40 and 41 on the second layer are automatically grayed out. The above data structure can be used to achieve this function. When the user selects the limited edition 574 white powder, we traverse the array of merchants.products. One item of this array is a product. The variationMappings on this product will have the current product. Color and size information. For my current project, if this product can be sold, it will be in the array of merchantdise.products. If it cannot be sold, there will be no such product in this array at all. For example, the limited edition 574 white powder in the picture above, the combination of 40 yards will not appear in merchantdise.products. If you can't find this combination when you search, it will turn it into gray and cannot be clicked.

So for the limited edition 574 white powder, 40 shoe, in order to know if it needs to be grayed out, I need to iterate through the array of merchants.products. According to the 11 colors and 11 sizes mentioned above, there will be a maximum of 121 products, that is, a maximum of 121 searches. Similarly, you need to know the limited edition 574 white powder, whether the product 41 can be sold, and the entire product array needs to be traversed. For 11 sizes, the entire product array needs to be traversed 11 times. For the two-layer option, 11 * 11 is already considered a lot, and there may not be serious performance problems with hundreds of operations per size. But if you add another layer of options, and if there are 11 options in this new layer, the complexity will increase by an index instantly, from $O(n^2)$ to $O(n^3)$! Now our total number of products is 11 * 11 * 11, which is 1331 products. If the third layer is gender, now in order to know whether the limited edition 574 white powder, 40, and male products can be sold, I need to traverse 1331 products, if It takes 20ms to traverse 121 items, which is relatively smooth. Then it takes 220ms to traverse 1331 items. This can obviously feel stuck. On some devices with poor hardware, this stuck will be more serious and become impossible. Accepted. Moreover, the technology used in our APP is React Native, and its performance is worse than the native one. In this way, users may uninstall it in anger!

I approached the product manager with the above questions about requirements and concerns about performance, and the following dialogue occurred:

Me: Boss, I found that there seems to be no APP on the market that supports the three-tier option. Is there a problem with this requirement? Compared with the two-tier option, the complexity of the three-tier option increases exponentially, and there may be performance issues. , the user will be stuck when using it.

Product Manager: Brother, what you are looking at are all domestic apps, but ours is for foreigners, and foreigners are used to it. If we want to sell it, we have to meet their needs. It's too stuck. I'll try to solve the performance problem. This is to add a few buttons to the UI. The design drawings are the same as before. I'll give you two days.

me! ? Forehead. . . Oh. . .

We don’t know a few foreigners, and we don’t dare to ask any more. They all say that it is the needs of users. We must meet the needs of the products before we can sell them. After the products are sold, we can only eat. Find a way to solve it!

solution

It seems that this requirement must be done. This function is not complicated, because the three-tier option can continue to traverse the commodity array by using the two-tier option, but the complexity increases exponentially, that is, from $O(n ^2)$ becomes $O(n^3)$, and users will get stuck when using it. Now, I need to think about whether there are other options to improve performance. After thinking about it, I found that this exponential increase in complexity comes from traversing our entire array, and if I could find a way not to traverse the array, I would immediately find the limited edition 574 white powder, 40, male counterpart Goods exist or not.

To convert this specific problem, it is actually: in an array, through a specific filter condition, find an item that meets the condition. Well, search sounds familiar. Now the reason why I need to traverse this array is because there is no direct correspondence between these search conditions and commodities. If I can establish a direct correspondence, then I can do it at once. did you find it? I thought: lookup trees. If I reorganize these hierarchies and organize them into a tree, where each item corresponds to a leaf node on the tree, I can reduce the search complexity of the three-level option from $O(n^3)$ to $O (1)$.

two-level search tree

To illustrate this algorithm, let me simplify the problem first. Suppose we now have two layers of options, color and size, and each layer of options has two options:

  1. Color: white, red
  2. Size: 39, 40

We now have 4 products corresponding to:

  1. Product No. 1: productId is 1, white, size 39
  2. Product No. 2: productId is 2, white, size 40
  3. Product No. 3: productId is 3, red, size 39
  4. Product No. 4: productId is 4, red, size 40

If we follow the simplest method, in order to find out whether the red shoes of size 39 exist, we need to traverse all the four products, and the time complexity at this time is $O(n^2)$. But if we build a tree like the following, we can reduce the time complexity to $O(1)$:

In the above tree, we ignore the root node. In this case, it is not useful. It is just the entrance of a tree. The first layer of light yellow nodes in this tree is our first layer option color, and the second layer is light blue. The color node is our second-level option size, but each color node will correspond to all sizes, so that our last second-level leaf node actually corresponds to a specific product. Now we want to find the red shoe size 39, just look at the node pointed by the red arrow in the picture to see if there is any product.

How should this data structure be represented in JS? In fact, it is very simple, an object will do, like this:

const tree = {
  "color: White": {
    "Size: 39": { productId: 1 },
    "Size: 40": { productId: 2 }
  },
  "color: red": {
    "Size: 39": { productId: 3 },
    "Size: 40": { productId: 4 }
  }
}

With the above data structure, we need to find the red 39 yards and directly take the value tree["color: red"]["size: 39"], and this complexity instantly becomes $O(1)$.

three-level search tree

After understanding the two-layer search tree above, it is simple to expand it to three layers, just add another layer directly. If our third-level option is gender now, and there are two options male and female, then our search tree looks like this:

The corresponding JS object:

const tree = {
  "color: White": {
    "Size: 39": { 
        "Sex: Male": { productId: 1 },
      "Gender: Female": { productId: 2 },
    },
    "Size: 40": { 
        "Sex: Male": { productId: 3 },
      "Gender: Female": { productId: 4 },
    }
  },
  "color: red": {
    "Size: 39": { 
        "Sex: Male": { productId: 5 },
      "Gender: Female": { productId: 6 },
    },
    "Size: 40": { 
        "Sex: Male": { productId: 7 },
      "Gender: Female": { productId: 8 },
    }
  }
}

Similarly, if we want to find a white, size 39, men's shoe, just tree["color: white"]["size: 39"]["gender: male"], this time complexity is also $ O(1)$.

write the code

The above algorithms have been figured out, and the rest is to write the code. The main code we need to write is to use the data returned by the API to build a structure like the above tree, which can be done in one traversal. For example, the structure returned by the API corresponding to the three-level search tree above is as follows:

const merchandise = {
  variations: [
    {
      name: 'color',
      values: [
        { name: 'White' },
        { name: 'red' },
      ]
    },
    {
      name: 'size',
      values: [
        { name: '39' },
        { name: '40' },
      ]
    },
    {
      name: 'gender',
      values: [
        { name: 'male' },
        { name: 'Female' },
      ]
    },
  ],
  products: [
    {
      id: 1,
      variationMappings: [
        { name: 'color', value: 'White' },
        { name: 'size', value: '39' },
        { name: 'gender', value: 'male' }
      ]
    }
    // There are 7 more products below, I will not repeat them
  ]
}

In order to convert the data returned by the API into our tree-structured data we write a method:

function buildTree(apiData) {
  const tree = {};
  const { variations, products } = apiData;

  // First use variations to build the tree structure, and the default value of leaf nodes is null
  addNode(tree, 0);
  function addNode(root, deep) {
    const variationName = variations[deep].name;
    const variationValues = variations[deep].values;

    for (let i = 0; i < variationValues.length; i++) {
      const nodeName = `${variationName}: ${variationValues[i].name}`;
      if (deep === 2) {
        root[nodeName] = null
      } else {
        root[nodeName] = {};
        addNode(root[nodeName], deep + 1);
      }
    }
  }

  // Then traverse the products once and fill in the values ​​for the leaf nodes of the tree
  for (let i = 0; i < products.length; i++) {
    const product = products[i];
    const { variationMappings } = product;
    const level1Name = `${variationMappings[0].name}: ${variationMappings[0].value}`;
    const level2Name = `${variationMappings[1].name}: ${variationMappings[1].value}`;
    const level3Name = `${variationMappings[2].name}: ${variationMappings[2].value}`;
    tree[level1Name][level2Name][level3Name] = product;
  }

  // Finally return the constructed tree
  return tree;
}

Then run the above API test data to see the effect, and find that the constructed tree fully meets our expectations:

Is this all right?

Now we have a search tree, when the user selects red, after 40 yards, in order to know whether the corresponding man can click, we don't need to traverse all the products, but can directly get the value from this structure. But is this done? not at all! Take a closer look at the data structure we built. The hierarchical relationship is fixed. The first layer is color, the second layer is size, the third layer is gender, and the corresponding products are placed on the third layer of gender. That is to say, to use this structure, the user must strictly follow, first choose the color, then choose the size, and then we will see which gender should be grayed out. If he doesn't follow this order, for example, he chooses gender male first, and then chooses size 40, then we should calculate which colors of the last layer should be grayed out. But using the above structure we can't figure it out, because we don't have the object tree["gender: male"]["size: 40"].

What to do about this? We don't have a gender-size-color tree, so let's build one! This is of course a method, but the user may have other operation sequences. If we want to cover all the possible operation sequences of the user, how many trees are needed in total? This is actually a full arrangement of the three variables of gender, size, and color, which is $A_3^3$, a total of 6 trees. A lazy person like me, let me build 6 trees, I'm really too lazy to do it. If we don't build so many trees and the demand cannot be covered, what should we do? Is there any way to be lazy? If I can do something about the requirements, can I avoid this problem? With this idea in mind, I thought of two things:

1. Give a default value.

When the user opens the product details page, the first available product is selected by default. This is equivalent to helping the user select a value in the order of color-size-gender from the beginning, and giving him a default order of operations.

2. No cancel function, only switch options

If the cancel function is provided, he cancels the default option of color-size-gender we provide, and can choose gender-size-color again. There is no cancel function, you can only switch by selecting other options, you can only change from red to white, but you cannot cancel red, the others are the same. In this way, we can always guarantee the order of color-size-gender. The user operation is only that the selected value of each level is different, the level order will not change, and our search tree will always be valid. And I found that some shopping sites can't cancel the option either, don't know if they have a similar problem.

Making these two changes to the requirements will not have much impact on the user experience. After discussing with the product manager, she agreed. In this way, I killed another 5 trees from the demand, and the laziness succeeded!

Here's what the three-tier option looks like in action:

And one more thing

In the previous solution, we solved the performance problem of search, but introduced a new problem, which is the need to create this search tree. Creating this search tree still requires a traversal of the product list, which is unavoidable. For a smoother user experience, we should try to hide this creation process where the user cannot perceive it. I have integrated it into the loading state of the product details page here. The user clicks to enter the product details page, and we go to the API to get data. There will inevitably be a loading state, which will turn around and so on. I also did this traversal process in this circle. When the API data is returned and the search tree is created, the circle will not end. This will theoretically prolong the time of the circle, but the local traversal will be faster than the network request, so the user perception is not obvious. When the circle is over, all the data are ready, and the user operation is the complexity of $O(1)$, which is really silky smooth~

Why not let the backend create this tree?

The above scheme is to create this tree on the front end. Is it possible that the data returned by the back end is like this at the beginning, I can just use it directly, so that I can be lazy again~ I really went to the back end, but He said to me: "I want to be lazy too!" Just kidding, the truth is, this commodity API is a microservice maintained by another team, and the data they provide is not only used by my terminal APP, but also used by other companies in the company. The product is used, so changing the return structure involves too much and cannot be changed at all.

package code

In fact, the implementation of our solution is relatively independent. If others use it, he doesn't care whether it is a tree or a grass in you. As long as you pass in the selection conditions, you can return the correct product, so we can encapsulate it as a class.

class VariationSearchMap {
    constructor(apiData) {
        this.tree = this.buildTree(apiData);
    }

      // This is how the previous tree was constructed
    buildTree(apiData) {
        const tree = {};
        const { variations, products } = apiData;

        // First use variations to build the tree structure, and the default value of leaf nodes is null
        addNode(tree, 0);
        function addNode(root, deep) {
            const variationName = variations[deep].name;
            const variationValues = variations[deep].values;

            for (let i = 0; i < variationValues.length; i++) {
                const nodeName = `${variationName}: ${variationValues[i].name}`;
                if (deep === variations.length - 1) {
                    root[nodeName] = null;
                } else {
                    root[nodeName] = {};
                    addNode(root[nodeName], deep + 1);
                }
            }
        }

        // Then traverse the products once and fill in the values ​​for the leaf nodes of the tree
        for (let i = 0; i < products.length; i++) {
            const product = products[i];
            const { variationMappings } = product;
            const level1Name = `${variationMappings[0].name}: ${variationMappings[0].value}`;
            const level2Name = `${variationMappings[1].name}: ${variationMappings[1].value}`;
            const level3Name = `${variationMappings[2].name}: ${variationMappings[2].value}`;
            tree[level1Name][level2Name][level3Name] = product;
        }

        // Finally return the constructed tree
        return tree;
    }

    // Add a method to search for items with the same parameter structure as variationMappings for API data
    findProductByVariationMappings(variationMappings) {
        const level1Name = `${variationMappings[0].name}: ${variationMappings[0].value}`;
        const level2Name = `${variationMappings[1].name}: ${variationMappings[1].value}`;
        const level3Name = `${variationMappings[2].name}: ${variationMappings[2].value}`;

        const product = this.tree[level1Name][level2Name][level3Name];

        return product;
    }
}

Then just use new when you use it:

const variationSearchMap = new VariationSearchMap(apiData);    // new an instance out

// Then you can use this instance to search
const searchCriteria = [
    { name: 'color', value: 'red' },
    { name: 'size', value: '40' },
    { name: 'gender', value: 'Female' }
];
const matchedProduct = variationSearchMap.findProductByVariationMappings(searchCriteria);
console.log('matchedProduct', matchedProduct);    // { productId: 8 }

Summarize

This article describes a requirement that I actually encountered in my work, and shares my implementation and optimization ideas for your reference. My implementation plan is not necessarily perfect. If you have a better solution, please discuss in the comment area~

The sample code that can be run in this article has been uploaded to GitHub, and you can take it down and play: https://github.com/dennis-jiang/Front-End-Knowledges/tree/master/Examples/DataStructureAndAlgorithm/OptimizeVariations

Let’s review the main points of this article:

  1. The requirement to be fulfilled in this article is a three-tier option for a commodity.
  2. When the user selects two tiers, the third tier option should automatically figure out what can and cannot be sold.
  3. Since there is no direct correspondence between the options returned by the backend API and the products, in order to find out whether they can be sold or not, we need to traverse all the products.
  4. When the total number of items is low, all item traversal may not cause significant performance issues.
  5. But when the options increase to three tiers, and the number of items increases exponentially, performance issues become apparent.
  6. For performance problems such as $O(n^3)$ that can be foreseen when writing code, we don't have to wait for bugs to be reported, but solve them directly during development.
  7. This example is to solve a search problem, so I thought of building a tree to directly reduce the complexity of $O(n^3)$ to $O(1)$.
  8. However, one tree cannot cover all user operations. To cover all user operations, six trees are required.
  9. For the purpose of being lazy, I discussed with the product manager, adjusted the requirements and interactions and cut down 5 trees. The real reason is that there are too many trees, which will take up more memory space and are not easy to maintain. Sometimes appropriate adjustment of requirements and interactions can also achieve the effect of optimizing performance. Performance optimization can combine interaction and technology to think.
  10. The search module of this tree can be individually encapsulated into a class, external users do not need to know the details, just call the interface to search directly.
  11. The front-end meeting point data structure is still useful, and it is still necessary in this scenario.

At the end of the article, thank you for spending your precious time reading this article. If this article has given you a little help or inspiration, please don’t be stingy with your likes and GitHub stars. Your support is the driving force for the author to continue to create.

Author blog post GitHub project address: https://github.com/dennis-jiang/Front-End-Knowledges

I also set up a public account [the big front end of the attack], no advertising, no hydrology, only high-quality originals, welcome to pay attention~

Tags: Front-end data structure Optimize

Posted by ready2drum on Fri, 06 May 2022 07:42:09 +0300