db-tutorial/docs/12.数据库/04.文档数据库/01.MongoDB/06.MongoDB建模示例.md

14 KiB
Raw Blame History

title date categories tags permalink
MongoDB 建模示例 2020-09-12 10:43:53
数据库
文档数据库
MongoDB
数据库
文档数据库
MongoDB
建模
/pages/88c7d3/

MongoDB 建模示例

关系型模型

嵌入式文档一对一关系模型

嵌入式文档一对一关系模型 - 嵌入式文档模式

// patron document
{
   _id: "joe",
   name: "Joe Bookreader"
}

// address document
{
   patron_id: "joe", // reference to patron document
   street: "123 Fake Street",
   city: "Faketon",
   state: "MA",
   zip: "12345"
}

合并为:

{
  "_id": "joe",
  "name": "Joe Bookreader",
  "address": {
    "street": "123 Fake Street",
    "city": "Faketon",
    "state": "MA",
    "zip": "12345"
  }
}

嵌入式文档一对一关系模型 - 子集模式

假设,有一个用于描述电影信息的 collection 定义:

{
  "_id": 1,
  "title": "The Arrival of a Train",
  "year": 1896,
  "runtime": 1,
  "released": ISODate("01-25-1896"),
  "poster": "http://ia.media-imdb.com/images/M/MV5BMjEyNDk5MDYzOV5BMl5BanBnXkFtZTgwNjIxMTEwMzE@._V1_SX300.jpg",
  "plot": "A group of people are standing in a straight line along the platform of a railway station, waiting for a train, which is seen coming at some distance. When the train stops at the platform, ...",
  "fullplot": "A group of people are standing in a straight line along the platform of a railway station, waiting for a train, which is seen coming at some distance. When the train stops at the platform, the line dissolves. The doors of the railway-cars open, and people on the platform help passengers to get off.",
  "lastupdated": ISODate("2015-08-15T10:06:53"),
  "type": "movie",
  "directors": ["Auguste Lumière", "Louis Lumière"],
  "imdb": {
    "rating": 7.3,
    "votes": 5043,
    "id": 12
  },
  "countries": ["France"],
  "genres": ["Documentary", "Short"],
  "tomatoes": {
    "viewer": {
      "rating": 3.7,
      "numReviews": 59
    },
    "lastUpdated": ISODate("2020-01-09T00:02:53")
  }
}

在应用中,有的场景只需要显示电影的简单浏览信息,不需要显示类似 fullplot、poster 这样的详细信息。因为,我们可以考虑将原结构一份为二,并通过 id 字段关联起来。

用于展示摘要信息的 movie collection

// movie collection

{
  "_id": 1,
  "title": "The Arrival of a Train",
  "year": 1896,
  "runtime": 1,
  "released": ISODate("1896-01-25"),
  "type": "movie",
  "directors": ["Auguste Lumière", "Louis Lumière"],
  "countries": ["France"],
  "genres": ["Documentary", "Short"]
}

用于展示细节信息的 movie_details collection

// movie_details collection

{
  "_id": 156,
  "movie_id": 1, // reference to the movie collection
  "poster": "http://ia.media-imdb.com/images/M/MV5BMjEyNDk5MDYzOV5BMl5BanBnXkFtZTgwNjIxMTEwMzE@._V1_SX300.jpg",
  "plot": "A group of people are standing in a straight line along the platform of a railway station, waiting for a train, which is seen coming at some distance. When the train stops at the platform, ...",
  "fullplot": "A group of people are standing in a straight line along the platform of a railway station, waiting for a train, which is seen coming at some distance. When the train stops at the platform, the line dissolves. The doors of the railway-cars open, and people on the platform help passengers to get off.",
  "lastupdated": ISODate("2015-08-15T10:06:53"),
  "imdb": {
    "rating": 7.3,
    "votes": 5043,
    "id": 12
  },
  "tomatoes": {
    "viewer": {
      "rating": 3.7,
      "numReviews": 59
    },
    "lastUpdated": ISODate("2020-01-29T00:02:53")
  }
}

嵌入式文档一对多关系模型

嵌入式文档一对多关系模型 - 嵌入式文档模式

// patron document
{
   _id: "joe",
   name: "Joe Bookreader"
}

// address documents
{
   patron_id: "joe", // reference to patron document
   street: "123 Fake Street",
   city: "Faketon",
   state: "MA",
   zip: "12345"
}

{
   patron_id: "joe",
   street: "1 Some Other Street",
   city: "Boston",
   state: "MA",
   zip: "12345"
}

合并为:

{
  "_id": "joe",
  "name": "Joe Bookreader",
  "addresses": [
    {
      "street": "123 Fake Street",
      "city": "Faketon",
      "state": "MA",
      "zip": "12345"
    },
    {
      "street": "1 Some Other Street",
      "city": "Boston",
      "state": "MA",
      "zip": "12345"
    }
  ]
}

嵌入式文档一对多关系模型 - 子集模式

考虑一个电商网站用于表示商品的 collection

{
  "_id": 1,
  "name": "Super Widget",
  "description": "This is the most useful item in your toolbox.",
  "price": { "value": NumberDecimal("119.99"), "currency": "USD" },
  "reviews": [
    {
      "review_id": 786,
      "review_author": "Kristina",
      "review_text": "This is indeed an amazing widget.",
      "published_date": ISODate("2019-02-18")
    },
    {
      "review_id": 785,
      "review_author": "Trina",
      "review_text": "Nice product. Slow shipping.",
      "published_date": ISODate("2019-02-17")
    },
    ...{
      "review_id": 1,
      "review_author": "Hans",
      "review_text": "Meh, it's okay.",
      "published_date": ISODate("2017-12-06")
    }
  ]
}

评论按时间倒序排列。 当用户访问产品页面时,应用程序将加载十条最近的评论。可以将集合分为两个集合,而不是与产品一起存储所有评论:

产品集合存储有关每个产品的信息,包括产品的十个最新评论:

{
  "_id": 1,
  "name": "Super Widget",
  "description": "This is the most useful item in your toolbox.",
  "price": { "value": NumberDecimal("119.99"), "currency": "USD" },
  "reviews": [
    {
      "review_id": 786,
      "review_author": "Kristina",
      "review_text": "This is indeed an amazing widget.",
      "published_date": ISODate("2019-02-18")
    }
    ...
    {
      "review_id": 776,
      "review_author": "Pablo",
      "review_text": "Amazing!",
      "published_date": ISODate("2019-02-16")
    }
  ]
}

review collection 存储所有的评论

{
  "review_id": 786,
  "product_id": 1,
  "review_author": "Kristina",
  "review_text": "This is indeed an amazing widget.",
  "published_date": ISODate("2019-02-18")
}
{
  "review_id": 785,
  "product_id": 1,
  "review_author": "Trina",
  "review_text": "Nice product. Slow shipping.",
  "published_date": ISODate("2019-02-17")
}
...
{
  "review_id": 1,
  "product_id": 1,
  "review_author": "Hans",
  "review_text": "Meh, it's okay.",
  "published_date": ISODate("2017-12-06")
}

引用式文档一对多关系模型

考虑以下映射出版商和书籍关系的示例。

该示例说明了引用式文档的优点,以避免重复发布者信息。

{
   title: "MongoDB: The Definitive Guide",
   author: [ "Kristina Chodorow", "Mike Dirolf" ],
   published_date: ISODate("2010-09-24"),
   pages: 216,
   language: "English",
   publisher: {
              name: "O'Reilly Media",
              founded: 1980,
              location: "CA"
            }
}

{
   title: "50 Tips and Tricks for MongoDB Developer",
   author: "Kristina Chodorow",
   published_date: ISODate("2011-05-06"),
   pages: 68,
   language: "English",
   publisher: {
              name: "O'Reilly Media",
              founded: 1980,
              location: "CA"
            }
}

为避免重复出版商数据,可以使用引用型文档,并将出版商信息与书本分开保存。 使用引用时,关系的增长决定了将引用存储在何处。 如果每个出版商的图书数量很少且增长有限,则有时将图书参考存储在出版商文档中可能会很有用。 否则,如果每个发布者的书籍数量不受限制,则此数据模型将导致可变的,不断增长的数组,如以下示例所示:

{
   name: "O'Reilly Media",
   founded: 1980,
   location: "CA",
   books: [123456789, 234567890, ...]
}

{
    _id: 123456789,
    title: "MongoDB: The Definitive Guide",
    author: [ "Kristina Chodorow", "Mike Dirolf" ],
    published_date: ISODate("2010-09-24"),
    pages: 216,
    language: "English"
}

{
   _id: 234567890,
   title: "50 Tips and Tricks for MongoDB Developer",
   author: "Kristina Chodorow",
   published_date: ISODate("2011-05-06"),
   pages: 68,
   language: "English"
}

为了避免可变的,增长的数组,请将发行者参考存储在书籍文档中:

{
   _id: "oreilly",
   name: "O'Reilly Media",
   founded: 1980,
   location: "CA"
}

{
   _id: 123456789,
   title: "MongoDB: The Definitive Guide",
   author: [ "Kristina Chodorow", "Mike Dirolf" ],
   published_date: ISODate("2010-09-24"),
   pages: 216,
   language: "English",
   publisher_id: "oreilly"
}

{
   _id: 234567890,
   title: "50 Tips and Tricks for MongoDB Developer",
   author: "Kristina Chodorow",
   published_date: ISODate("2011-05-06"),
   pages: 68,
   language: "English",
   publisher_id: "oreilly"
}

树形结构模型

img

具有父节点的树形结构模型

上图结构可以用父引用来表示:

db.categories.insertMany([
  { "_id": "MongoDB", "parent": "Databases" },
  { "_id": "dbm", "parent": "Databases" },
  { "_id": "Databases", "parent": "Programming" },
  { "_id": "Languages", "parent": "Programming" },
  { "_id": "Programming", "parent": "Books" },
  { "_id": "Books", "parent": null }
])
  • 检索节点的父节点:

    db.categories.findOne( { _id: "MongoDB" } ).parent
    
  • 可以在父字段上创建索引以启用父节点的快速搜索:

    db.categories.createIndex( { parent: 1 } )
    
  • 可以通过父字段查询找到其直接子节点:

    db.categories.find( { parent: "Databases" } )
    
  • 检索子树,可以参考: $graphLookup.

具有子节点的树形结构模型

db.categories.insertMany([
  { "_id": "MongoDB", "children": [] },
  { "_id": "dbm", "children": [] },
  { "_id": "Databases", "children": ["MongoDB", "dbm"] },
  { "_id": "Languages", "children": [] },
  { "_id": "Programming", "children": ["Databases", "Languages"] },
  { "_id": "Books", "children": ["Programming"] }
])
  • 检索节点的 children

    db.categories.findOne( { _id: "Databases" } ).children
    
  • 可以在 children 字段上创建索引以启用子节点的快速搜索:

    db.categories.createIndex( { children: 1 } )
    
  • 可以在 children 字段中查询节点,以找到其父节点及其兄弟节点:

    db.categories.find( { children: "MongoDB" } )
    

具有祖先的树形结构模型

db.categories.insertMany([
  {
    "_id": "MongoDB",
    "ancestors": ["Books", "Programming", "Databases"],
    "parent": "Databases"
  },
  {
    "_id": "dbm",
    "ancestors": ["Books", "Programming", "Databases"],
    "parent": "Databases"
  },
  {
    "_id": "Databases",
    "ancestors": ["Books", "Programming"],
    "parent": "Programming"
  },
  {
    "_id": "Languages",
    "ancestors": ["Books", "Programming"],
    "parent": "Programming"
  },
  { "_id": "Programming", "ancestors": ["Books"], "parent": "Books" },
  { "_id": "Books", "ancestors": [], "parent": null }
])
  • 检索节点的祖先或路径的查询是快速而直接的:

    db.categories.findOne({ "_id": "MongoDB" }).ancestors
    
  • 可以在 ancestors 字段上创建索引,以启用祖先节点的快速搜索:

    db.categories.createIndex({ "ancestors": 1 })
    
  • 可以通过 ancestors 字段查询查找其所有后代:

    db.categories.find({ "ancestors": "Programming" })
    

具有实体化路径的树形结构模型

db.categories.insertMany([
  { "_id": "Books", "path": null },
  { "_id": "Programming", "path": ",Books," },
  { "_id": "Databases", "path": ",Books,Programming," },
  { "_id": "Languages", "path": ",Books,Programming," },
  { "_id": "MongoDB", "path": ",Books,Programming,Databases," },
  { "_id": "dbm", "path": ",Books,Programming,Databases," }
])
  • 可以查询以检索整个树,并按字段路径排序:

    db.categories.find().sort( { path: 1 } )
    
  • 可以在 path 字段上使用正则表达式来查找 Programming 的后代

    db.categories.find( { path: /,Programming,/ } )
    
  • 可以检索 Books 的后代,其中 Books 也位于层次结构的最高级别:

    db.categories.find( { path: /^,Books,/ } )
    
  • 要在 path 字段上创建索引,请使用以下调用:

    db.categories.createIndex( { path: 1 } )
    

具有嵌套集的树形结构模型

img

db.categories.insertMany([
  { _id: 'Books', parent: 0, left: 1, right: 12 },
  { _id: 'Programming', parent: 'Books', left: 2, right: 11 },
  { _id: 'Languages', parent: 'Programming', left: 3, right: 4 },
  { _id: 'Databases', parent: 'Programming', left: 5, right: 10 },
  { _id: 'MongoDB', parent: 'Databases', left: 6, right: 7 },
  { _id: 'dbm', parent: 'Databases', left: 8, right: 9 }
])

可以查询以检索节点的后代:

var databaseCategory = db.categories.findOne({ _id: 'Databases' })
db.categories.find({
  left: { $gt: databaseCategory.left },
  right: { $lt: databaseCategory.right }
})

设计模式

大文档,很多列,很多索引

解决方案是:列转行

img

管理文档不同版本

MongoDB 文档格式非常灵活,势必会带来版本维护上的难度。

解决方案是:可以增加一个版本号字段

  • 快速过滤掉不需要升级的文档
  • 升级时,对不同版本的文档做不同处理

统计网页点击量

统计数据精确性要求并不是十分重要。

解决方案:用近似计算

每隔 10 次写一次:

{ "$inc": { "views": 1 } }

精确统计

解决方案:使用预聚合

参考资料