Troubleshoot high CPU usage in MongoDB

Analyze the requests being executed by the MongoDB database

1. Connect to the instance through Mongo Shell.

For details, see Mongo Shell connecting to a single node instance, Mongo Shell connecting to a replica set instance, and Mongo Shell connecting to a sharded cluster instance.

2. Execute the db.currentOp() command to view the operations currently being performed by the database.

An example output from this command is as follows.

{
        "desc" : "conn632530",
        "threadId" : "140298196924160",
        "connectionId" : 632530,
        "client" : "11.192.159.236:57052",
        "active" : true,
        "opid" : 1008837885,
        "secs_running" : 0,
        "microsecs_running" : NumberLong(70),
        "op" : "update",
        "ns" : "mygame.players",
        "query" : {
            "uid" : NumberLong(31577677)
        },
        "numYields" : 0,
        "locks" : {
            "Global" : "w",
            "Database" : "w",
            "Collection" : "w"
        },
        ....
    },

Pay attention to the following fields.

Field: Description of the returned value
client: The client that initiated the request.
opid: The unique identifier of the operation. Note: If necessary, you can terminate the operation directly with db.killOp(opid).
secs_running: How long the operation has been running, in seconds. If this value is particularly large, check whether the request is reasonable.
microsecs_running: How long the operation has been running, in microseconds. If this value is particularly large, check whether the request is reasonable.
ns: The target collection (namespace) of the operation.
op: The type of operation, usually one of query, insert, update, or remove.
locks: Lock-related information. See Concurrency Introduction for details; this article does not cover it.

Note: For the full field reference, see the db.currentOp documentation.

Use db.currentOp() to view the operations in progress and check whether any request is taking abnormally long. For example, suppose the CPU usage of your workload is normally low, and an operations engineer connects to the MongoDB database and runs operations that require a full table scan. CPU usage then becomes very high and the business responds slowly. In this case, focus on the operations with long execution times.
Note: If you find an abnormal request, look up its opid and run db.killOp(opid) to terminate it.
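
To narrow the db.currentOp() output down to suspicious requests, the entries of its inprog array can be filtered by running time. The following is a minimal sketch: the helper name findLongRunningOps, the 3-second threshold, and the second sample entry are assumptions made for illustration and are not part of MongoDB.

```javascript
// Hypothetical helper: return the active operations that have been running
// for at least `minSecs` seconds, longest first.
// `inprog` is the array stored in the `inprog` field of db.currentOp().
function findLongRunningOps(inprog, minSecs) {
  return inprog
    .filter((op) => op.active === true && (op.secs_running || 0) >= minSecs)
    .sort((a, b) => b.secs_running - a.secs_running)
    .map((op) => ({
      opid: op.opid,
      client: op.client,
      op: op.op,
      ns: op.ns,
      secs_running: op.secs_running,
    }));
}

// Sample data shaped like the example output above.
// The second entry is made up to represent a long-running request.
const sampleOps = [
  { opid: 1008837885, client: "11.192.159.236:57052", active: true, op: "update", ns: "mygame.players", secs_running: 0 },
  { opid: 1008837990, client: "11.192.159.236:57060", active: true, op: "query", ns: "mygame.players", secs_running: 42 },
];

const slowOps = findLongRunningOps(sampleOps, 3);
console.log(slowOps);
```

In the Mongo Shell, the same helper could be called as findLongRunningOps(db.currentOp().inprog, 3). Each returned opid can then be passed to db.killOp(opid) once you have confirmed that the request really is abnormal.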

Analyze slow requests for MongoDB databases

Switch to the target database with the use command.

use mongodbtest

Run the following command to view the slow request log of that database.

db.system.profile.find().pretty()

Analyze slow request logs to find the cause of increased MongoDB CPU usage.

The following is an example of a slow request log. The request performed a full table scan: it scanned 11,000,000 documents and did not use an index.

{
        "op" : "query",
        "ns" : "123.testCollection",
        "command" : {
                "find" : "testCollection",
                "filter" : {
                        "name" : "zhangsan"
                },
                "$db" : "123"
        },
        "keysExamined" : 0,
        "docsExamined" : 11000000,
        "cursorExhausted" : true,
        "numYield" : 85977,
        "nreturned" : 0,
        "locks" : {
                "Global" : {
                        "acquireCount" : {
                                "r" : NumberLong(85978)
                        }
                },
                "Database" : {
                        "acquireCount" : {
                                "r" : NumberLong(85978)
                        }
                },
                "Collection" : {
                        "acquireCount" : {
                                "r" : NumberLong(85978)
                        }
                }
        },
        "responseLength" : 232,
        "protocol" : "op_command",
        "millis" : 19428,
        "planSummary" : "COLLSCAN",
        "execStats" : {
                "stage" : "COLLSCAN",
                "filter" : {
                        "name" : {
                                "$eq" : "zhangsan"
                        }
                },
                "nReturned" : 0,
                "executionTimeMillisEstimate" : 18233,
                "works" : 11000002,
                "advanced" : 0,
                "needTime" : 11000001,
                "needYield" : 0,
                "saveState" : 85977,
                "restoreState" : 85977,
                "isEOF" : 1,
                "invalidates" : 0,
                "direction" : "forward",
....in"
                }
        ],
        "user" : "root@admin"
}

Usually in slow request logs, you need to focus on the following points.

Full table scan (keywords: COLLSCAN, docsExamined)
COLLSCAN indicates a full collection (table) scan.
When a request (such as a query, update, or delete) requires a full table scan, it consumes a lot of CPU resources. If you find the COLLSCAN keyword in the slow request log, these queries are most likely what is consuming the CPU.

Note: If such requests are frequent, build an index on the queried field to optimize them.

The docsExamined value shows how many documents a query scanned. The larger the value, the higher the CPU cost.
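
The check described above can be sketched as a small filter over profiler entries. This is only an illustration: the helper name findCollectionScans and the docsPerResult ratio are assumptions, and the second sample entry is made up for contrast.

```javascript
// Hypothetical helper: pick out profiler entries that ran a full collection
// scan and report how many documents were scanned per returned document.
function findCollectionScans(entries) {
  return entries
    .filter((e) => typeof e.planSummary === "string" && e.planSummary.includes("COLLSCAN"))
    .map((e) => ({
      ns: e.ns,
      millis: e.millis,
      docsExamined: e.docsExamined,
      nreturned: e.nreturned,
      // Divide by at least 1 so that a query returning nothing does not
      // produce a division by zero.
      docsPerResult: e.docsExamined / Math.max(e.nreturned, 1),
    }))
    .sort((a, b) => b.docsExamined - a.docsExamined);
}

// The first entry copies the relevant fields of the slow request log above.
// The second entry is made up to show an indexed query for comparison.
const profileEntries = [
  { op: "query", ns: "123.testCollection", planSummary: "COLLSCAN", keysExamined: 0, docsExamined: 11000000, nreturned: 0, millis: 19428 },
  { op: "query", ns: "123.testCollection", planSummary: "IXSCAN { name: 1 }", keysExamined: 1, docsExamined: 1, nreturned: 1, millis: 0 },
];

const scans = findCollectionScans(profileEntries);
console.log(scans);
```

In the Mongo Shell, the input could come from db.system.profile.find().toArray(). Entries with a large docsExamined and a small nreturned are the first candidates for a new index.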
Unreasonable index (keywords: IXSCAN, keysExamined)

Note: More indexes are not always better. Too many indexes degrade write and update performance. If your application is write-heavy, indexes can hurt performance.

The keysExamined field shows how many index keys were scanned by a query that uses an index. The larger the value, the higher the CPU cost.
If an index is poorly designed, or the query matches many results, using the index does not reduce the request overhead much, and execution is still slow.

Suppose a collection contains the following data. The values of the x field repeat heavily (only 1 and 2 occur), while the values of the y field rarely repeat.

{ x: 1, y: 1 }
{ x: 1, y: 2 }
{ x: 1, y: 3 }
......
{ x: 1, y: 100000 }
{ x: 2, y: 1 }
{ x: 2, y: 2 }
{ x: 2, y: 3 }
......
{ x: 2, y: 100000 }

The goal is to serve a query such as {x: 1, y: 2}.

db.collection.createIndex( {x: 1} ) works poorly, because x has too many identical values.
db.collection.createIndex( {x: 1, y: 1} ) can serve this exact query, because both fields are matched by equality, but a query that filters on x alone still scans a large share of the index.
db.collection.createIndex( {y: 1} ) works well, because y has very few identical values.
db.collection.createIndex( {y: 1, x: 1} ) works well, because y has very few identical values.
For the difference between {y: 1} and {y: 1, x: 1}, see the Index Principle and Compound Index sections of the official MongoDB documentation.
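
The effect of value repetition on a single-field index can be illustrated without a database. The sketch below rebuilds the sample data in memory and counts how many entries share one value of x or of y. It is a simplified model rather than MongoDB's actual index implementation, and the helper name countMatches is an assumption.

```javascript
// Build the sample data: x is 1 or 2, and y runs from 1 to 100000 for each x.
const docs = [];
for (const x of [1, 2]) {
  for (let y = 1; y <= 100000; y++) {
    docs.push({ x, y });
  }
}

// Count the documents that share one value of a single field. For a
// single-field index, this approximates how many index keys must be walked
// before the remaining field of the query can be checked.
function countMatches(field, value) {
  return docs.filter((d) => d[field] === value).length;
}

const candidatesViaX = countMatches("x", 1); // models an index on { x: 1 }
const candidatesViaY = countMatches("y", 2); // models an index on { y: 1 }
console.log({ candidatesViaX, candidatesViaY });
```

With this data, the query {x: 1, y: 2} leaves 100000 candidates when only x is indexed and 2 candidates when only y is indexed, which is why the index on the rarely repeated field costs far less CPU.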

Sorting large amounts of data (keywords: SORT, hasSortStage)
When a query contains a sort that cannot be satisfied by an index, MongoDB sorts the query results itself, and the hasSortStage field in the system.profile collection is true. Sorting is very CPU-intensive, so frequently sorted fields should be indexed.

Note: When you find the SORT keyword in the system.profile collection, consider using an index to optimize the sort.
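
As a rough sketch of that check, profiler entries can be filtered on hasSortStage to list the sort keys that may deserve an index. The helper name findSortStages and both sample entries (including the age field) are made up for illustration.

```javascript
// Hypothetical helper: list profiler entries whose sort was not satisfied
// by an index, slowest first, together with the requested sort keys.
function findSortStages(entries) {
  return entries
    .filter((e) => e.hasSortStage === true)
    .sort((a, b) => b.millis - a.millis)
    .map((e) => ({
      ns: e.ns,
      millis: e.millis,
      sort: e.command ? e.command.sort : undefined,
    }));
}

// Made-up entries shaped like documents in the system.profile collection.
const sortEntries = [
  { ns: "123.testCollection", millis: 12, command: { find: "testCollection", filter: { name: "zhangsan" } } },
  { ns: "123.testCollection", millis: 850, hasSortStage: true, command: { find: "testCollection", filter: {}, sort: { age: -1 } } },
];

const sortCandidates = findSortStages(sortEntries);
console.log(sortCandidates);
```

Sort keys that appear repeatedly in the result are the fields to consider indexing first.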
Other operations, such as index creation and aggregation (a combination of traversal, query, update, sorting, and similar actions), may also consume a lot of CPU resources, but they are essentially the same as the scenarios above. For more profiling settings, see the official profiling documentation.

Link: https://help.aliyun.com/document_detail/62224.html

Tags: Database

Posted by scarhand on Mon, 09 May 2022 07:46:34 +0300