16-理解示例数据集

选择背景色：黄橙洋红淡粉水蓝草绿白色选择字体：宋体黑体微软雅黑楷体选择字体大小：小中大特恢复默认

5.4.1　理解示例数据集

创建示例数据集 本节有两个目的。一是让您实际编写一个MongoDB shell脚本，这种技能在本书后面很有用，因为大部分示例都是脚本。二是提供一个填充数据的脚本，本书后面将使用这些数据来执行数据库操作。在这个练习中，您将添加程序清单5.10所示的代码。这些大多是简单的JavaScript代码，只有最后三行例外，它们连接到数据库words，将集合word_stats的文档删除以清除所有数据，再将一个JavaScript对象数组插入到该集合中。您现在不用明白这些代码，本书后面将介绍它们。在这个练习中，您的重点是将这些代码输入文件generate_words.js中。请执行如下步骤，编写一个生成数据集的JavaScript文件，并使用命令来执行它，以创建并填充数据库words。 1．从本书的配套网站下载文件code/hour05/generate_words.js，并将其放到您的文件夹code/hour05中。程序清单5.10显示了这个文件中的代码，但请注意，为节省篇幅，对第3行进行了裁剪；在配套网站提供的文件中，第3行包含2500多个单词。您也可以使用自己的用逗号分隔的单词列表替换第3行的内容。 2．将这个文件存盘。 3．启动mongDB服务器。 4．再打开一个控制台窗口，并切换到目录code/hour05。 5．执行下面的命令以运行生成数据库words的脚本： 6．执行下面的命令，以核实确实创建了数据库words。文档数应超过2500。 程序清单5.10　连接到MongoDB服务器，执行初始化并填充数据库words的JavaScript文件 ▲

程序清单5.9显示了这个数据集中对象的结构。这个结构应该相当直观，这也是选择使用它的原因。这个文档结构包含如下类型的字段：字符串、整数、数组、子文档和子文档数组，让您能够测试各种查询、排序、分组和聚合方法。

字段word、first、last和size都是字符串或数字；字段letters是一个数组，包含单词中的字母（不重复）；字段stats是一个子文档，包含字段vowels和consonants；字段charsets是一个子文档数组，其中的子文档包含字段type（取值为consonants、vowels或other）和字段chars（由指定类型的字符组成的数组）。

程序清单5.9　Example Dataset Structure for a words Database

{
word: <word>,
first: <first_letter>,
last: <last_letter>,
size: <character_count>,
letters: [<array_of_characters_in_word_no_repeats>],
stats: {
  vowels:<vowel_count>, consonants:<consonant_count>},
charsets: [
  {
    "type": <consonants_vowels_other>,
    "chars": [<array_of_characters_of_type_in_word>]},
    ...
  ],
}

▼　Try It Yourself

mongo generate_words.js

mongo words --eval "db.word_stats.find().count()"

01 var vowelArr = "aeiou";
02 var consonantArr = "bcdfghjklmnpqrstvwxyz";
03 var words =
➥"the,be,and,of,to,it,I,can't,shouldn't,say,middle-class,apology,till";
04 var wordArr = words.split(",");
05 var wordObjArr = new Array();
06 for (var i=0; i<wordArr.length; i++){
07    try{
08       var word = wordArr[i].toLowerCase();
09       var vowelCnt = ("|"+word+"|").split(/[aeiou]/i).length-1;
10       var consonantCnt =
➥ ("|"+word+"|").split(/[bcdfghjklmnpqrstvwxyz]/i).length-1;
11       var letters = [];
12       var vowels = [];
13       var consonants = [];
14       var other = [];
15       for (var j=0; j<word.length; j++){
16          var ch = word[j];
17          if (letters.indexOf(ch) === -1){
18             letters.push(ch);
19          }
20          if (vowelArr.indexOf(ch) !== -1){
21             if(vowels.indexOf(ch) === -1){
22                vowels.push(ch);
23             }
24          }else if (consonantArr.indexOf(ch) !== -1){
25             if(consonants.indexOf(ch) === -1){
26                consonants.push(ch);
27             }
28          }else{
29             if(other.indexOf(ch) === -1){
30                other.push(ch);
31             }
32          }
33       }
34       var charsets = [];
35       if(consonants.length){
36          charsets.push({type:"consonants", chars:consonants});
37       }
38       if(vowels.length){
39          charsets.push({type:"vowels", chars:vowels});
40       }
41       if(other.length){
42          charsets.push({type:"other", chars:other});
43       }
44       var wordObj = {
45          word: word,
46          first: word[0],
47          last: word[word.length-1],
48          size: word.length,
49          letters: letters,
50          stats: { vowels: vowelCnt, consonants: consonantCnt },
51          charsets: charsets
52       };
53       if(other.length){
54          wordObj.otherChars = other;
55       }
56       wordObjArr.push(wordObj);
57    } catch (e){
58       console.log(e);
59       console.log(word);
60    }
61 }
62 db = connect("localhost/words");
63 db.word_stats.remove({});
64 db.word_stats.ensureIndex({word: 1}, {unique: true});
65 db.word_stats.insert(wordObjArr);

Previous 上一篇本节目录 Next 下一篇

16-理解示例数据集

5.4.1 理解示例数据集

5.4.1　理解示例数据集