MongoDB is a cross-platform, document-oriented NoSQL (non-relational) database built around collections and documents. Each document stores its data as key-value pairs. Refer to MongoDB and Python for an in-depth introduction to the topic. This article explains how to use the distinct() method in PyMongo.
distinct()
PyMongo provides the distinct() method on a collection. It finds the distinct values of a specified field across a single collection and returns them as a list.
Syntax : distinct(key, filter = None, session = None, **kwargs)
Parameters :
- key : field name for which the distinct values need to be found.
- filter : (Optional) A query document that specifies the documents from which to retrieve the distinct values.
- session : (Optional) a ClientSession.
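As a rough sketch of how the key and filter parameters fit together, the helper below forwards both to distinct() and sorts the result. The name distinct_sorted is hypothetical and not part of PyMongo. The sketch assumes the returned values are mutually comparable (for example, all strings), and it sorts because the order of the values returned by distinct() should not be relied upon.

```python
def distinct_sorted(collection, key, query=None):
    """Hypothetical helper (not part of PyMongo).

    Calls collection.distinct(key, query) and returns the values sorted.
    Assumes the values can be compared with each other, e.g. all strings.
    """
    # `query` is passed as the `filter` argument of distinct();
    # None means "consider every document in the collection".
    values = collection.distinct(key, query)
    return sorted(values)
```

Given a PyMongo collection object coll, a call such as distinct_sorted(coll, "item.code", {"dept": "A"}) would restrict the search to documents whose dept is "A" before collecting the distinct codes.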
Let’s create a sample collection :
# importing the module
from pymongo import MongoClient

# creating a MongoClient object; with no arguments it
# connects to localhost on the default port 27017
client = MongoClient()

# accessing the database
mydatabase = client['database']

# accessing a collection of the database
mycollection = mydatabase['myTable']

documents = [
    {"_id": 1, "dept": "A", "item": {"code": "012", "color": "red"}, "sizes": ["S", "L"]},
    {"_id": 2, "dept": "A", "item": {"code": "012", "color": "blue"}, "sizes": ["M", "S"]},
    {"_id": 3, "dept": "B", "item": {"code": "101", "color": "blue"}, "sizes": "L"},
    {"_id": 4, "dept": "A", "item": {"code": "679", "color": "black"}, "sizes": ["M"]},
]

# inserting the documents and printing them back
mycollection.insert_many(documents)
for doc in mycollection.find({}):
    print(doc)
Output :
{'_id': 1, 'dept': 'A', 'item': {'code': '012', 'color': 'red'}, 'sizes': ['S', 'L']}
{'_id': 2, 'dept': 'A', 'item': {'code': '012', 'color': 'blue'}, 'sizes': ['M', 'S']}
{'_id': 3, 'dept': 'B', 'item': {'code': '101', 'color': 'blue'}, 'sizes': 'L'}
{'_id': 4, 'dept': 'A', 'item': {'code': '679', 'color': 'black'}, 'sizes': ['M']}
Now we will use the distinct() method to :
- Return distinct values for a field
- Return distinct values for an embedded field
- Return distinct values for an array field
- Return distinct values from documents that match a query
# distinct values for the field dept
# from all documents in the mycollection collection
print(mycollection.distinct('dept'))

# distinct values for the field color,
# embedded in the field item, from all documents
# in the mycollection collection
print(mycollection.distinct('item.color'))

# distinct values for the array field sizes
# from all documents in the mycollection collection;
# each array element is treated as a separate value
print(mycollection.distinct("sizes"))

# distinct values for the field code,
# embedded in the field item, from the documents
# in the mycollection collection whose dept is equal to B
print(mycollection.distinct("item.code", {"dept": "B"}))
Output :
['A', 'B']
['red', 'blue', 'black']
['L', 'S', 'M']
['101']
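To see why the array field sizes yields individual elements rather than whole lists, the following pure-Python sketch mimics the behaviour on a plain list of dictionaries. It is only an illustration, not how PyMongo or the MongoDB server implements distinct(). The names get_path and distinct_local are hypothetical, the query support is limited to equality on top-level fields, and dotted keys are followed through nested dictionaries only.

```python
def get_path(doc, dotted_key):
    """Follow a dotted key such as 'item.color' through nested dicts.

    Returns (value, True) if the path exists, otherwise (None, False).
    """
    current = doc
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return None, False
        current = current[part]
    return current, True


def distinct_local(documents, key, query=None):
    """Hypothetical stand-in for distinct() on in-memory dictionaries."""
    query = query or {}
    seen = []
    for doc in documents:
        # Simplified filter: every queried top-level field must be equal.
        if any(doc.get(field) != wanted for field, wanted in query.items()):
            continue
        value, found = get_path(doc, key)
        if not found:
            continue
        # A list contributes each of its elements; a scalar contributes itself.
        candidates = value if isinstance(value, list) else [value]
        for candidate in candidates:
            if candidate not in seen:
                seen.append(candidate)
    return seen


documents = [
    {"_id": 1, "dept": "A", "item": {"code": "012", "color": "red"}, "sizes": ["S", "L"]},
    {"_id": 2, "dept": "A", "item": {"code": "012", "color": "blue"}, "sizes": ["M", "S"]},
    {"_id": 3, "dept": "B", "item": {"code": "101", "color": "blue"}, "sizes": "L"},
    {"_id": 4, "dept": "A", "item": {"code": "679", "color": "black"}, "sizes": ["M"]},
]

print(distinct_local(documents, "dept"))
print(distinct_local(documents, "item.color"))
print(distinct_local(documents, "sizes"))
print(distinct_local(documents, "item.code", {"dept": "B"}))
```

The sketch collects values in first-seen order, so its ordering may differ from what a real server returns; only the set of values is meant to match.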