PHC7065 CRITICAL SKILLS IN DATA MANIPULATION FOR POPULATION SCIENCE

NoSQL Databases 2

Hui Hu Ph.D.

Department of Epidemiology

College of Public Health and Health Professions & College of Medicine

February 19, 2018

Aggregation

 

Join

 

Lab: MongoDB 2

find()

SELECT

  • MongoDB is a collection of documents
db.collection.find(<query filter>,<projection>).<cursor modifier>

Similar to FROM, specifies the collection to use

Similar to WHERE, specifies which documents to return

Projection variables in SELECT

How many results to return

Counting and Distinct

db.Students2.count('age')
db.Students2.count({'age':{'$exists':True}})
SELECT COUNT(age) FROM Students2
SELECT COUNT(DISTINCT age) FROM Students2
len(db.Students2.distinct('age'))

Aggregation

Aggregation

db.Students2.aggregate([
    {'$group':{'_id':'$department','meanAge':{'$avg':'$age'}}}
])
SELECT MEAN(age) FROM Students2 GROUP BY department;
SELECT MEAN(age) FROM Students2 
WHERE course0='a' OR course1='a' OR course2='a' 
GROUP BY department;
db.Students2.aggregate([
    {'$match':{'courses':{'$in':['a']}}}
    {'$group':{'_id':'$department','meanAge':{'$avg':'$age'}}}
])

Multi-attribute Aggregation

db.Students2.aggregate([
    {
     '$group':{
        '_id':{'department':'$department','college':'$college'},
        'meanAge':{'$avg':'$age'}
        }
    }
])
SELECT MEAN(age) FROM Students2 GROUP BY department, college;

Aggregation with Text Search

db.Students2.aggregate([
    {'$match':{'$text':{'$search':'Miami Florida'}}},
    {'$sort':{'score':{'$meta':'textScore'}}},
    {'$project':{'hometown':1,'name':1}}
])

Join

Join

Colleges

Students2

db.Colleges.aggregate([
    {
     '$lookup':{
        'from':'Students2',
        'localField':'name',
        'foreignField':'college',
        'as':'students'
      }
    }
])
db.Students2.aggregate([
    {
     '$lookup':{
        'from':'Colleges',
        'localField':'college',
        'foreignField':'name',
        'as':'colleges'
      }
    }
])

MongoDB 2

git pull

PHC7065-Spring2018-Lecture6

By Hui Hu

PHC7065-Spring2018-Lecture6

Slides for Lecture 6, Spring 2018, PHC7065 Critical Skills in Data Manipulation for Population Science

  • 415
Loading comments...

More from Hui Hu