Analysis and comparison of crud operations performance of relational and NoSQL databases
Автор: Oganesyan A.
Журнал: Мировая наука @science-j
Рубрика: Естественные и технические науки
Статья в выпуске: 8 (77), 2023 года.
Бесплатный доступ
Despite the fact that NoSQL systems have existed for quite a long time, today there are relatively few studies on the topic of comparing their performance with relational systems. The available works often do not allow us to get a complete picture, because either they describe experiments of a narrow focus (for example, comparing time spent only on data insertion operations), or they have specialized and rarely used DBMS as research objects. As part of this work, it is proposed to consider the fairly well-known PostgreSQL and MongoDB.
Dbms, postgresql, mongodb, efficiency
Короткий адрес: https://sciup.org/140301302
IDR: 140301302
Текст научной статьи Analysis and comparison of crud operations performance of relational and NoSQL databases
To date, it is safe to say that the process of informatization concerns all spheres of human activity, which means that certain information repositories are used almost everywhere.
A relational database is a set of interconnected tables, each of which contains information about objects of a certain type. Each row of the table contains data about one object (for example, a car, a computer, a client), and the columns of the table contain various characteristics of these objects - attributes (for example, engine number, processor brand, phone numbers of companies or customers).
As part of this work, the "online store-warehouse" structure is organized for the PostgreSQL DBMS (figure 1).
Figure 1– ER database diagram for PostgreSQL
As for MongoDB, it is a non-relational database, which means it is impossible to build arbitrary queries based on the available data. This problem is solved, as a rule, in two ways. The first of them consists in designing collections in the manner of tables from relational databases. The connection itself is carried out within the framework of the application. The second method is related to data denormalization. By placing, for example, the t_address collection inside the t_user collection (while leaving a separate copy of the t_address table), you can provide the possibility of pre-organizing join requests for these entities. This approach, however, is associated with very serious difficulties in ensuring data consistency, because changes that have occurred with a specific record in one collection must occur in all copies. Thus, one should be extremely careful when implementing "pre-join" and take into account the difficulties associated with it when analyzing the experiments described in this paper.
A personal computer with the following characteristics was used as a workstation:
-
• Operating system: Windows 10;
-
• processor: Intel Core i7 2.6 GHz;
-
• RAM: 8 GB.
PostgreSQL version is 9.6.1, MongoDB version is 3.4.2. An Internet service was used to generate data [9]. Each experiment was conducted for 10000; 100,000; 500,000; 1 000,000; 2,000,000 and 5,000,000 records with calculation of the average execution time for thirty attempts. Calculations and construction of histograms were carried out in the Microsoft Excel software product. To measure the execution time in PostgreSQL, the /timing directive was used, in MongoDB, methods of profiling the operation log were used, using the explain() method where possible, as well as placing timestamps with subsequent calculation of the difference between their values.
It is worth noting that the MongoDB internal query analyzer has an accuracy of 1 ms, so when conducting experiments with a small amount of data, it will not be possible to get accurate information about the query execution time.
Insert

0,21700 1,99827 10,87564 25,40014 49,01154 148,13567
0,00378 1,11754 4,51324 11,98754 25,98654 75,11024
MongoDB
PostgreSQL
Amount of rows
Figure 2– Comparison of insertion operation execution time
In the experiment on inserting records, the t_item entity was used, storing rows of the form:
Table 1 – Example of a row from the t_item table.
item_id (integer) |
item_name varchar(30) |
item_model varchar(30) |
item_weight float |
item_price float |
item_desc text |
1 |
vitae consectetuer |
adipiscing |
51,886 |
9869,215 |
ante ipsum primis in faucibus |
The data update experiment was carried out within the t_address table, the zip numeric field (zip code) was changed[1,3]
Update

MongoDB 0,14897 2,01254 11,11547 22,11425 43,15248 112,59865
PostgreSQL 0,03010 0,49857 4,58762 14,01389 24,98564 64,15487
Amount of rows
Figure 3– Comparison of the execution time of update operations
Select with and without B-tree index (1% of

PostgreSQL (No index) 0,00198 0,01499 0,07211 0,18958 0,48117 1,20014
Amount of rows
Figure 4– Comparison of the execution time of sampling operations with an index and without using an index
For experiments with the sampling operation, the following scenario was used: the execution time of operations with the sampling condition for the same records with the presence of an index based on the binary search tree and without it was measured. The condition for the item_price field of the float type was used. One percent of the records satisfied the search expression[2]
The following are the results of experiments with the selection operation with table attachment (t_user and t_address by the user_address_id and address_id fields, respectively).
Join

MongoDB ($lookup) 0,36998 3,84412 17,89625 36,88365 75,15987 228,77214
MongoDB 0,00215 0,02698 0,11098 0,21996 0,50112 1,04985
PostgreSQL 0,04325 0,42987 2,24995 4,79898 11,75994 28,22934
Amount of rows
Figure 5– Comparison of the execution time of data attachment operations
As already mentioned above, today under the word
NoSQL is understood not by those DBMSs that are managed using a language that does not belong to the SQL standard, but rather by those that are not relational. It is all the more surprising to see that in the MongoDB version -3.2 – the developer company has provided the opportunity to organize connections by common fields of the table [1]. The connection command looks like this:
db.t_user.aggregate([{$lookup:{ from: "t_address", localField: "user_address_id", foreignField: "address_id", as: "find_address"}}]), where from is the name of the external collection, localField is the attachment field from the collection in question, foreignField is the attachment field from the external collection, as – alias is the name for the resulting attachment of records.
At the moment, it is possible to join only one field and only in the left outer join format. Thus, based on the existing limitations and time characteristics of the execution of this query, when organizing table joins in the selected NoSQL solution, other methods should be used.
Group by city

MongoDB 0,01598 0,10965 0,47985 1,16521 2,36547 6,14852
PostgreSQL 0,00698 0,034587 0,16001 0,29990 0,6112 1,58112
Amount of rows
Figure 6– Comparison of the execution time of data grouping operations
This experiment is based on a query calculating how many resource users represent a particular city (based on the t_address table; sort the result in descending order of the aggregating function)[4]
The last experiment in this chapter is related to finding the maximum value of the item_price field in the t_item table.
Max(item_price)

Figure 7–Comparison of the execution time of the maximum value search operations
Thus, experiments show the advantage of PostgreSQL in all tasks except indexed search. As for the joining operation, the decision on data denormalization for NoSQL DBMS directly depends on the specific task, taking into account the costs of maintaining consistency and storing redundant information.
List:
-
1. Release Notes for MongoDB 3.2 – MongoDB Manual 3.2 [Электронный ресурс]. URL: https://docs.mongodb.org/manual/release- notes/3.2/#aggregation- framework-enhancements
-
2. Rick Cattell. Scalable SQL and NoSQL data stores. ACM SIGMOD Record,Volume 39 Issue 4, December 2010. – NY: ACM New York. – p.12-27
-
3. Parker Z., Poe S., Vrbsky S. Comparing NoSQL MongoDB to an SQL DB, Proceedings of the 51st ACM Southeast Conference. – NY: ACM New York,2013. – 6 p.
-
4. Daniel J. Abadi, Peter A. Boncz, Stavros Harizopoulos. Column-oriented database systems. Proceedings of the VLDB Endowment, Volume 2 Issue 2, August 2009 (pages 1664-1665).
Список литературы Analysis and comparison of crud operations performance of relational and NoSQL databases
- Release Notes for Mongo B 3.2 - Mongo DB Manual 3.2 [Электронный ресурс]. URL: https://docs.mongodb.org/manual/release-notes/3.2/#aggregation- framework-enhancements.
- Rick Cattell. Scalable SQL and NoSQL data stores. ACM SIGMOD Record, Volume 39 Issue 4, December 2010. - NY: ACM New York. - p.12-27.
- Parker Z., Poe S., Vrbsky S.Comparing NoSQL MongoDB to an SQL DB, Proceedings of the 51st ACM Southeast Conference. - NY: ACM New York, 2013. - 6 p.
- Daniel J. Abadi, Peter A. Boncz, Stavros Harizopoulos. Column-oriented database systems. Proceedings of the VLDB Endowment, Volume 2 Issue 2, August 2009 (pages 1664-1665).