131. Can I use index
keys to constrain query matches?
You can use the min() and max() methods to constrain the
results of the cursor returned from find() by using index keys.
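For example, the following is a minimal sketch, assuming a hypothetical products collection with an index on { price: 1 }; min() is an inclusive bound and max() is exclusive:
db.products.find().min( { price: 10 } ).max( { price: 100 } ).hint( { price: 1 } )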
132. Using $ne and
$nin in a query is slow. Why?
The $ne and $nin operators are not selective. If you need to
use these, it is often best to make sure that an additional, more selective
criterion is part of the query.
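For example, the following sketch pairs $ne with an equality match on a hypothetical status field, so the selective equality condition narrows the candidate documents before $ne is applied:
db.orders.find( { status: "A", qty: { $ne: 20 } } )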
133. Can I use a
multi-key index to support a query for a whole array?
Not entirely. The index can partially support these queries
because it can speed the selection of the first element of the array; however,
comparing all subsequent items in the array cannot use the index and must scan
the documents individually.
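As an illustration, assuming a hypothetical inventory collection with a multi-key index on { tags: 1 }, the whole-array query below can use the index only to find documents whose array contains "red" (the first element); comparing the rest of the array still requires examining the candidate documents:
db.inventory.find( { tags: [ "red", "blue" ] } )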
134. How can I use an effective index strategy for attribute lookups?
For simple attribute lookups that don't require sorted result sets or range queries, consider creating a field (e.g. attrib) that holds an array of sub-documents, where each sub-document stores one attribute as a key/value pair. You can then index this attrib field.
For example, the attrib field in the following document allows you to add an unlimited number of attribute types:
{ _id : ObjectId(...),
  attrib : [
    { k: "color", v: "red" },
    { k: "shape", v: "rectangle" },
    { k: "color", v: "blue" },
    { k: "avail", v: true }
  ]
}
Both of the following queries could use the same {
"attrib.k": 1, "attrib.v": 1 } index:
db.mycollection.find( { attrib: { $elemMatch : { k:
"color", v: "blue" } } } )
db.mycollection.find( { attrib: { $elemMatch : { k:
"avail", v: true } } } )
135. Where can I find
information about a mongod process that stopped running unexpectedly?
If mongod shuts down unexpectedly on a UNIX or UNIX-based
platform, and if mongod fails to log a shutdown or error message, then check
your system logs for messages pertaining to MongoDB. For example, for logs
located in /var/log/messages, use the following commands:
sudo grep mongod /var/log/messages
sudo grep score /var/log/messages
136. Does TCP
keepalive time affect sharded clusters and replica sets?
If you experience socket errors between members of a sharded cluster or replica set that do not have other reasonable causes, check the TCP keepalive value, which Linux systems store as the tcp_keepalive_time value. A
common keep alive period is 7200 seconds (2 hours); however, different
distributions and OS X may have different settings. For MongoDB, you will have
better experiences with shorter keepalive periods, on the order of 300 seconds
(five minutes).
On Linux systems you can use the following operation to
check the value of tcp_keepalive_time:
cat /proc/sys/net/ipv4/tcp_keepalive_time
You can change the tcp_keepalive_time value with the
following operation:
echo 300 > /proc/sys/net/ipv4/tcp_keepalive_time
The new tcp_keepalive_time value takes effect without
requiring you to restart the mongod or mongos servers. When you reboot or
restart your system you will need to set the new tcp_keepalive_time value, or
see your operating system’s documentation for setting the TCP keepalive value
persistently.
For OS X systems, issue the following command to view the
keep alive setting:
sysctl net.inet.tcp.keepinit
To set a shorter keep alive period use the following
invocation:
sysctl -w net.inet.tcp.keepinit=300
If your replica set or sharded cluster experiences
keepalive-related issues, you must alter the tcp_keepalive_time value on all
machines hosting MongoDB processes. This includes all machines hosting mongos
or mongod servers.
Windows users should consult the Windows Server Technet article on KeepAliveTime configuration for more information on setting keep alive for MongoDB deployments on Windows systems.
137. What tools are
available for monitoring MongoDB?
The MongoDB Management Service (MMS) <http://mms.mongodb.com> includes monitoring. MMS Monitoring is a free, hosted service for monitoring MongoDB deployments. A full list of third-party tools is available as part of the
138. Do I need to
configure swap space?
Always configure systems to have swap space. Without swap, your system may not be resilient in some situations with extreme memory constraints, memory leaks, or multiple programs using the same memory. Think of
the swap space as something like a steam release valve that allows the system
to release extra pressure without affecting the overall functioning of the
system.
Nevertheless, systems running MongoDB do not need swap for
routine operation. Database files are memory-mapped and should constitute most
of your MongoDB memory use. Therefore, it is unlikely that mongod will ever use
any swap space in normal operation. The operating system will release memory
from the memory mapped files without needing swap and MongoDB can write data to
the data files without needing the swap system.
139. What is “working
set” and how can I estimate its size?
The working set for a MongoDB database is the portion of
your data that clients access most often. You can estimate the size of the working set using the workingSet document in the output of serverStatus. To return
serverStatus with the workingSet document, issue a command in the following form:
db.runCommand( { serverStatus: 1, workingSet: 1 } )
140. Must my working set size fit in RAM?
Your working set should stay in memory to achieve good performance. Otherwise many random disk I/Os will occur, and unless you are using SSDs, this can be quite slow.
One area to watch specifically in managing the size of your working set is index access patterns. If you are inserting into indexes at random locations (as would happen with ids that are randomly generated by hashes), you will continually be updating the whole index. If instead you are able to create your ids in approximately ascending order (for example, day concatenated with a random id), all the updates will occur at the right side of the b-tree and the working set size for index pages will be much smaller.
It is fine if databases and thus virtual size are much
larger than RAM.
141. How do I
calculate how much RAM I need for my application?
The amount of RAM you need depends on several factors,
including but not limited to:
• The relationship between database storage and working set.
• The operating system’s cache strategy for LRU (Least
Recently Used)
• The impact of journaling
• The number or rate of page faults and other MMS gauges to
detect when you need more RAM
• Each database connection thread will need up to 1 MB of
RAM.
MongoDB defers to the operating system when loading data
into memory from disk. It simply memory maps all its data files and relies on
the operating system to cache data. The OS typically evicts the least-recently-used data from RAM when it runs low on memory. For example, if clients access
indexes more frequently than documents, then indexes will more likely stay in
RAM, but it depends on your particular usage.
To calculate how much RAM you need, you must calculate your
working set size, or the portion of your data that clients use most often. This
depends on your access patterns, what indexes you have, and the size of your
documents.
Because MongoDB uses a thread per connection model, each
database connection also will need up to 1MB of RAM, whether active or idle.
If page faults are infrequent, your working set fits in RAM.
If fault rates rise higher than that, you risk performance degradation. This is
less critical with SSD drives than with spinning disks.
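As a rough sketch, you can watch two of the signals mentioned above from the shell; extra_info.page_faults is reported on Linux, and each connection accounts for up to 1 MB of RAM:
db.serverStatus().extra_info.page_faults
db.serverStatus().connections.current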
142. How do I read memory statistics in the UNIX top command?
Because mongod uses memory-mapped files, the memory
statistics in top require interpretation in a special way. On a large database,
VSIZE (virtual bytes) tends to be the size of the entire database. If the machine is not running other processes, RSIZE (resident bytes) approaches the total memory of the machine, as this counts file system cache contents.
For Linux systems, use the vmstat command to help determine
how the system uses memory. On OS X systems use vm_stat.
143. What are the factors for a successful sharded cluster?
The two most important factors in maintaining a successful
sharded cluster are:
• choosing an appropriate shard key and
• sufficient capacity to support current and future
operations.
You can prevent most issues encountered with sharding by choosing the best possible shard key for your deployment and by adding additional capacity to your cluster well before the current resources become saturated.
144. In a new sharded cluster, why does all data remain on one shard?
Your cluster must have sufficient data for sharding to make
sense. Sharding works by migrating chunks between the shards until each shard
has roughly the same number of chunks.
The default chunk size is 64 megabytes. MongoDB will not
begin migrations until the imbalance of chunks in the cluster exceeds the
migration threshold. While the default chunk size is configurable with the
chunkSize setting, these behaviors help prevent unnecessary chunk migrations,
which can degrade the performance of your cluster as a whole.
If you have just deployed a sharded cluster, make sure that
you have enough data to make sharding effective. If you do not have sufficient
data to create more than eight 64 megabyte chunks, then all data will remain on
one shard. Either lower the chunk size
setting, or add more data to the cluster.
As a related problem, the system will split chunks only on
inserts or updates, which means that if you configure sharding and do not continue to issue insert
and update operations, the database will not create any chunks. You can either
wait until your application inserts data or split chunks manually.
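A minimal sketch of a manual split; the namespace mydb.records and the shard key value below are hypothetical examples:
sh.splitAt( "mydb.records", { user_id: 1000 } )
sh.splitFind( "mydb.records", { user_id: 1000 } )   // alternatively, split the containing chunk at its median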
Finally, if your shard key has a low cardinality, MongoDB
may not be able to create sufficient splits among the data.
145. Why would one shard receive a disproportionate amount of traffic in a sharded cluster?
In some situations, a single shard or a subset of the
cluster will receive a disproportionate portion of the traffic and workload. In
almost all cases this is the result of a shard key that does not effectively
allow write scaling.
It’s also possible that you have “hot chunks.” In this case,
you may be able to solve the problem by splitting and then migrating parts of
these chunks.
In the worst case, you may have to consider re-sharding your
data and choosing a different shard key to correct this pattern.
146. What can prevent
a sharded cluster from balancing?
If you have just deployed your sharded cluster, you may want
to consider the troubleshooting suggestions for a new cluster where data
remains on a single shard.
If the cluster was initially balanced, but later developed
an uneven distribution of data, consider the following possible causes:
• You have deleted or removed a significant amount of data
from the cluster. If you have added additional data, it may have a different
distribution with regards to its shard key.
• Your shard key has low cardinality and MongoDB cannot
split the chunks any further.
• Your data set is growing faster than the balancer can distribute data around the cluster. This is uncommon and typically is the result of:
– a balancing window that is too short, given the rate of
data growth.
– an uneven distribution of write operations that requires
more data migration. You may have to choose a different shard key to resolve
this issue.
– poor network connectivity between shards, which may lead
to chunk migrations that take too long to complete. Investigate your network
configuration and interconnections between shards.
147. Why do chunk
migrations affect sharded cluster performance?
If migrations impact your cluster or application’s
performance, consider the following options, depending on the nature of the
impact:
1. If migrations only interrupt your cluster sporadically, you can limit the balancing window to prevent balancing activity during peak hours (see the sketch at the end of this answer). Ensure that there is enough time remaining to keep the data from becoming out of balance again.
2. If the balancer is always migrating chunks to the
detriment of overall cluster performance:
• You may want to attempt decreasing the chunk size to limit
the size of the migration.
• Your cluster may be over capacity, and you may want to
attempt to add one or two shards to the cluster to distribute load.
It’s also possible that your shard key causes your
application to direct all writes to a single shard. This kind of activity
pattern can require the balancer to migrate most data soon after writing it.
Consider redeploying your cluster with a shard key that provides better write
scaling.
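To set the balancing window mentioned in option 1, connect to a mongos, switch to the config database, and update the balancer settings document; the window times below are hypothetical examples:
use config
db.settings.update(
   { _id: "balancer" },
   { $set: { activeWindow: { start: "23:00", stop: "06:00" } } },
   { upsert: true }
)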
148. What is the default location of MongoDB data files and log files?
The MongoDB instance stores its data files in /var/lib/mongo and its log files in /var/log/mongodb
by default, and runs using the mongod user account. You can specify
alternate log and data file directories in /etc/mongodb.conf.
If you change the user that runs the MongoDB process, you must modify the access control rights to the /var/lib/mongo and /var/log/mongodb directories to give this user access to these directories.
149. Install MongoDB Enterprise on Red Hat Enterprise Linux or CentOS
Packages
MongoDB provides packages of the officially supported MongoDB Enterprise builds in its own repository. This repository provides the MongoDB Enterprise distribution in the following packages:
• mongodb-enterprise -- This package is a metapackage that will automatically install the four component packages listed below.
• mongodb-enterprise-server -- This package contains the mongod daemon and associated configuration and init scripts.
• mongodb-enterprise-mongos -- This package contains the mongos daemon.
• mongodb-enterprise-shell -- This package contains the mongo shell.
• mongodb-enterprise-tools -- This package contains the following MongoDB tools: bsondump, mongodump, mongoexport, mongofiles, mongoimport, mongooplog, mongoperf, mongorestore, mongostat, and mongotop.
Control Scripts
The
mongodb-enterprise package includes various control scripts, including the init
script /etc/rc.d/init.d/mongod.
The package
configures MongoDB using the /etc/mongod.conf file in conjunction with the
control scripts.
As of version
2.6.4, there are no control scripts for mongos. The mongos process is used only
in sharding.
You can use the
mongod init script to derive your own mongos control script.
Considerations
MongoDB only provides Enterprise packages for 64-bit builds of Red Hat Enterprise Linux and CentOS Linux versions 5 and 6.
The default /etc/mongod.conf configuration file supplied by the 2.6 series packages has bind_ip set to 127.0.0.1 by default. Modify this setting as needed for your environment before initializing a replica set.
Changed in version 2.6: The package structure and names have changed as of version 2.6.
Install MongoDB Enterprise
When you
install the packages for MongoDB Enterprise, you choose whether to install the
current release or a previous one. This
procedure describes how to do both.
Step 1:
Configure repository. Create an /etc/yum.repos.d/mongodb-enterprise.repo file
so that you can install MongoDB enterprise directly, using yum.
Use the
following repository file to specify the latest stable release of MongoDB
enterprise.
[mongodb-enterprise]
name=MongoDB Enterprise Repository
baseurl=https://repo.mongodb.com/yum/redhat/$releasever/mongodb-enterprise/stable/$basearch/
gpgcheck=0
enabled=1
Use the
following repository to install only versions of MongoDB for the 2.6 release.
If you’d like to install MongoDB Enterprise packages from a particular release
series, such as 2.4 or 2.6, you can specify the release series in the
repository configuration. For example, to restrict your system to the 2.6
release series, create a /etc/yum.repos.d/mongodb-enterprise-2.6.repo file to
hold the following configuration information for the MongoDB Enterprise 2.6
repository:
[mongodb-enterprise-2.6]
name=MongoDB Enterprise 2.6 Repository
baseurl=https://repo.mongodb.com/yum/redhat/$releasever/mongodb-enterprise/2.6/$basearch/
gpgcheck=0
enabled=1
.repo files for
each release can also be found in the repository itself. Remember that
odd-numbered minor release versions (e.g. 2.5) are development versions and are
unsuitable for production deployment.
Step 2: Install the MongoDB Enterprise packages and associated tools. You can install either the latest stable version of MongoDB Enterprise or a specific version of MongoDB Enterprise.
Install the
latest stable version of MongoDB Enterprise. Issue the following command:
sudo yum install -y mongodb-enterprise
Step 3: Optional. Manage the installed version.
Install a specific release of MongoDB Enterprise. Specify each component package individually and append the version number to the package name, as in the following example that installs the 2.6.1 release of MongoDB:
sudo yum install -y mongodb-enterprise-2.6.1 mongodb-enterprise-server-2.6.1 mongodb-enterprise-shell-2.6.1 mongodb-enterprise-mongos-2.6.1 mongodb-enterprise-tools-2.6.1
Pin a specific version of MongoDB Enterprise. Although you can specify any available version of MongoDB Enterprise, yum will upgrade the packages when a newer version becomes available. To prevent unintended upgrades, pin the package. To pin a package, add the following exclude directive to your /etc/yum.conf file:
exclude=mongodb-enterprise,mongodb-enterprise-server,mongodb-enterprise-shell,mongodb-enterprise-mongos,mongodb-enterprise-tools
Previous versions of MongoDB packages use different naming conventions.
Step 4: When the install completes, you can run MongoDB.
Run MongoDB Enterprise
Important: You
must configure SELinux to allow MongoDB to start on Red Hat Linux-based systems
(Red Hat
Enterprise
Linux, CentOS, Fedora). Administrators have three options:
• enable access to the relevant ports for SELinux. For default settings, this can be accomplished by running:
semanage port -a -t mongodb_port_t -p tcp 27017
• set SELinux to permissive mode in /etc/selinux/config. The line
SELINUX=enforcing
should be changed to
SELINUX=permissive
• disable SELinux entirely; as above, but set
SELINUX=disabled
All three options require root privileges. The latter two options each require a system reboot and may have larger implications for your deployment.
150. Explain the start and stop process of MongoDB?
Step 1: Start MongoDB. You can start the
mongod process by issuing the following command:
sudo service mongod start
Step 2: Verify that MongoDB has started successfully. You can verify that the mongod process has started successfully
by checking the
contents of the log file at /var/log/mongodb/mongod.log for a line reading
[initandlisten] waiting for connections on port
<port>
where
<port> is the port configured in /etc/mongod.conf, 27017 by
default.
You can
optionally ensure that MongoDB will start following a system reboot by issuing
the following command:
sudo chkconfig mongod on
Step 3: Stop MongoDB. As needed, you can
stop the mongod process by issuing the following command:
sudo service mongod stop
Step 4: Restart MongoDB. You can restart the
mongod process by issuing the following command:
sudo service mongod restart
You can follow
the state of the process for errors or important messages by watching the
output in the
/var/log/mongodb/mongod.log
file.
Step 5: Begin using MongoDB.
151. What is the default port of MongoDB and in which configuration file is it set?
The port is configured in /etc/mongod.conf and is 27017 by default.
152. Which format is used to store data in MongoDB?
MongoDB doesn’t
actually use JSON to store the data; rather, it uses an open data format
developed by the MongoDB team called BSON
(pronounced Bee-Son), which is short for Binary-JSON. BSON makes MongoDB
even faster by making it much easier for a computer to process and search
documents. BSON also adds a couple of features that aren’t available in
standard JSON, including the ability to add types for handling binary data.
153. Explain & compare JSON & BSON?
JSON allows
complex data structures to be represented in a simple, human-readable text
format that is generally considered to be much easier to read and understand
than XML. Like XML, JSON was envisaged as a way to exchange data between a web
client (such as a browser) and web applications.
BSON is much easier to traverse (i.e., to look through) and can be indexed very quickly. Although BSON
requires slightly more disk space than JSON, this extra space is unlikely
to be a problem because disks are cheap, and MongoDB can scale across machines.
The second key
benefit to using BSON is that it is easy and quick to convert BSON to a
programming language’s native data format. If the data were stored in pure
JSON, a relatively high-level conversion would need to take place. There are
MongoDB drivers for a large number of programming languages (such as Python,
Ruby, PHP, C, C++ and C#), and each works slightly differently. Using a simple
binary format, native data structures can be quickly built for each language,
without requiring that you first process JSON. This makes the code simpler and
faster, both of which are in keeping with MongoDB’s stated goals.
BSON also
provides some extensions to JSON. For example, it enables you to store binary
data and incorporates a specific date type. Thus, while BSON can store any JSON
document, a valid BSON document may not be valid JSON. This doesn’t
matter because each language has its own driver that converts data to and from
BSON without needing to use JSON as an intermediary language.
154. Advantages of MongoDB over RDBMS
• Schema-less: MongoDB is a document database in which one collection holds different documents. The number of fields, content, and size of documents can differ from one document to another.
• Structure of a single object is clear.
• No complex joins.
• Deep query-ability: MongoDB supports dynamic queries on documents using a document-based query language that's nearly as powerful as SQL.
• Tuning.
• Ease of scale-out: MongoDB is easy to scale.
• Conversion/mapping of application objects to database objects is not needed.
• Uses internal memory for storing the (windowed) working set, enabling faster access of data.
155. Feature List of MongoDB?
Using Document-Orientated Storage (BSON)
Supporting Dynamic Queries
Indexing Your Documents
Leveraging Geospatial Indexes
Profiling Queries
Updating Information In-Place
Storing Binary Data
Replicating Data
Implementing Auto Sharding
Using Map and Reduce Functions
156. Explain Version Numbers of MongoDB
MongoDB uses
the “odd-numbered
versions for development releases” approach. In other words, you can
tell by looking at the second number of the version number (also called the
release number) whether a version is a development
version or a stable version. If the
second number is even, then it’s a stable release. If the second number is an
odd number, then it’s an unstable, or development, release.
Let’s take a
closer look at the three digits included in a version number’s three parts, A,
B, and C:
A, the first
(or left-most) number: Represents the major version and only changes
when there is a
full version upgrade.
B, the second
(or middle) number: Represents the release number and indicates
whether a
version is a development version or a stable version. If the number is
even, the
version is stable; if the number is odd, then the version is unstable and
considered a
development release.
C, the third
(or right-most) number: Represents the revision number; this is used
for bugs and
security issues.
For example, at
the time of writing, the following versions were available from the
MongoDB website:
1.6.1
(Production release)
1.4.4 (Previous
release)
1.7.0-pre (Development release)
157. Installation Layout of MongoDB ?
After you install or extract MongoDB successfully, you will have the applications shown below available in the bin directory (in both Linux and Windows).
|-- bin
| |-- mongo
(the database shell)
| |-- mongod
(the core database server)
| |-- mongos (auto-sharding
process)
| |-- mongodump
(dump/export utility)
| `-- mongorestore (restore/import utility)
The installed
software includes five applications that you will be using in conjunction with
your MongoDB databases. The two “most important” applications are the mongo and
mongod applications.
The mongo application allows you to use the
database shell; this shell enables you to accomplish practically anything you’d
want to do with MongoDB.
The mongod application starts the service
or daemon, as it’s also called. There are also many flags you can set when
launching the MongoDB applications.
For example, the service lets you specify the path where the database is
located (--dbpath), show version information (--version), and even print some
diagnostic system information (with the --sysinfo flag)!
You can view the entire list of options by including the --help flag when you
launch the service. For now, you can just use the defaults and start the
service by typing mongod in your shell or command prompt.
Example of our test server
/usr/bin
--mongotop
--mongostat
--mongorestore
--mongoperf
--mongooplog
--mongoimport
--mongodump
--mongo
--mongos
--mongofiles
--mongoexport
--mongod
158. What is the test database in MongoDB?
If you start
the MongoDB service with the default parameters, and start the shell with the
default settings, then you will be connected to the default test database running on your local host. This database is
created automatically the moment you connect to it. This is one of MongoDB’s
most powerful features: if you attempt to connect to a database that
does not exist, MongoDB will automatically create it for you.
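A minimal sketch of this behavior from the shell; mydb and mycollection are hypothetical names:
use mydb
db.mycollection.insert( { x: 1 } )   // the database and collection are created on this first write
show dbs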
159. Explain the _id field in MongoDB?
Every object
within the MongoDB database contains a unique identifier to distinguish that
object from every other object. This unique identifier is called the _id key, and it is added automatically
to every document you create in a collection.
The _id key is
the first attribute added in each new document you create. This remains true
even if you do not tell MongoDB to create this key.
_id is a 12-byte value (usually displayed as a 24-character hexadecimal string) that ensures the uniqueness of every document. You can provide the _id yourself while inserting the document. If you do not provide one, MongoDB provides a unique id for every document.
If you do not
specify the _id value manually, then the type will be set to a special BSON
datatype that consists of a 12-byte binary value.
The 12-byte value consists of a 4-byte timestamp (seconds since epoch), a 3-byte machine id, a 2-byte process id, and
a 3-byte counter. It’s good to know that the counter and timestamp fields are
stored in Big Endian. This is because MongoDB wants to ensure that
there is an increasing order to these values, and a Big Endian approach suits
this requirement best.
Every
additional supported driver that you load when working with MongoDB (such as
the PHP driver or the Python driver) supports this special BSON datatype and
uses it whenever new data is created.
You can also invoke ObjectId() from
the MongoDB shell to create a value for an _id key.
Optionally, you
can specify your own value by using ObjectId(string),
where string represents the specified hex string.
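A short sketch of both approaches; the hexadecimal string below is a made-up example value:
db.mycollection.insert( { item: "a" } )                                     // _id is added automatically
db.mycollection.insert( { _id: ObjectId(), item: "b" } )                    // explicitly generated ObjectId
db.mycollection.insert( { _id: ObjectId("53d98f94e3a5b2c4d8f01234"), item: "c" } )   // ObjectId from a hex string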
160. Explain Big Endian and Little Endian?
Big Endian and Little Endian refer to how the individual bytes of a longer data word are ordered in memory. Big Endian simply means that the most significant byte is saved first. Similarly, Little Endian means that the least significant byte is saved first.
161. What are the datatypes that can be used in a MongoDB document?
Possible types of data you can add to a document, and what you use them for:
• String: This commonly used datatype contains a string of text (or any other kind of characters). This datatype is used mostly for storing text values (e.g., { "Country" : "Japan" }).
• Integer (32-bit and 64-bit): This type is used to store a numerical value (e.g., { "Rank" : 1 }). Note that there are no quotes placed before or after the integer.
• Boolean: This datatype can be set to either TRUE or FALSE.
• Double: This datatype is used to store floating point values.
• Min / Max keys: This datatype is used to compare a value against the lowest and highest BSON elements, respectively.
• Arrays: This datatype is used to store arrays (e.g., ["Membrey, Peter", "Plugge, Eelco", "Hawkins, Tim"]).
• Timestamp: This datatype is used to store a timestamp. This can be handy for recording when a document has been modified or added.
• Object: This datatype is used for embedded documents.
• Null: This datatype is used for a Null value.
• Symbol: This datatype is used identically to a string (see above); however, it's generally reserved for languages that use a specific symbol type.
• Date *: This datatype is used to store the current date or time in UNIX time format (POSIX time).
• Object ID *: This datatype is used to store the document's ID.
• Binary data *: This datatype is used to store binary data.
• Regular expression *: This datatype is used for regular expressions. All options are represented by specific characters provided in alphabetical order.
• JavaScript Code *: This datatype is used for JavaScript code.
The last five datatypes (date, object id, binary data, regex, and JavaScript code) are non-JSON datatypes; specifically, they are special datatypes that BSON allows you to use.
162. File system snapshots for MongoDB backup?
File system snapshots are an operating system volume manager feature, and are not specific to MongoDB. The mechanics of snapshots depend on the underlying storage system. For example, Amazon's EBS storage system for EC2 supports snapshots, and on Linux the LVM manager can create a snapshot.
To get a correct snapshot of a running mongod process, you
must have journaling enabled and the journal must reside on the same logical
volume as the other MongoDB data files. Without journaling enabled, there is no
guarantee that
the snapshot will be consistent or valid.
To get a consistent snapshot of a sharded system, you must
disable the balancer and capture a snapshot from every shard and a config
server at approximately the same moment in time.
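For the sharded case, a sketch of pausing the balancer around the snapshot, run from a mongos:
sh.stopBalancer()
// ... capture a snapshot from every shard and a config server ...
sh.startBalancer()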
163. Backup with mongodump. Also mention pros & cons?
The mongodump tool reads data from a MongoDB database and
creates high fidelity BSON files. The mongorestore tool can populate a MongoDB
database with the data from these BSON files. These tools are simple and
efficient for backing up small MongoDB deployments, but are not ideal for
capturing backups of larger systems.
mongodump and mongorestore can operate against a running mongod process, or can manipulate the underlying data files directly. By default, mongodump does not capture the contents of the local database.
• mongodump only captures the documents in the database. The resulting backup is space efficient, but mongorestore or mongod must rebuild the indexes after restoring data.
• When connected to a MongoDB instance, mongodump can adversely affect mongod performance. If your data is larger than system memory, the queries will push the working set out of memory.
To mitigate the impact of mongodump on the performance of
the replica set, use mongodump to capture backups from a secondary member of a replica set. Alternatively, you
can shut down a secondary and use mongodump with the data files directly. If
you shut down a secondary to capture data with mongodump ensure that the operation
can complete before its oplog becomes too stale to continue replicating.
For replica sets, mongodump also supports a point in time
feature with the --oplog option. Applications may continue modifying data while
mongodump captures the output. To restore a point in time backup created with
--oplog, use mongorestore with the --oplogReplay option.
If applications
modify data while mongodump is creating a backup, mongodump will compete for
resources with those applications.
164. MongoDB Reporting Tools
This section
provides an overview of the reporting methods distributed with MongoDB.
Utilities The MongoDB distribution includes a number of utilities that quickly
return statistics about instances’ performance and activity. Typically, these
are most useful for diagnosing issues and assessing normal operation.
mongostat mongostat
captures and returns the counts of database operations by type (e.g. insert,
query, update, delete, etc.). These counts report on the load distribution on
the server. Use mongostat to understand the distribution of operation types and
to inform capacity planning.
mongotop mongotop
tracks and reports the current read and write activity of a MongoDB instance,
and reports these statistics on a per collection basis.
Use mongotop to
check if your database activity and use match your expectations.
REST Interface MongoDB provides a simple
REST interface that can be useful for configuring monitoring and alert scripts,
and for other administrative tasks.
To enable,
configure mongod to use REST, either by starting mongod with the --rest
option, or by setting the net.http.RESTInterfaceEnabled
setting to true in a configuration file.
HTTP Console MongoDB provides a web interface
that exposes diagnostic and monitoring information in a simple web page. The
web interface is accessible at localhost:<port>, where the <port>
number is 1000 more than the mongod port.
For example, if
a locally running mongod is using the default port 27017, access the HTTP
console at http://localhost:28017.
Commands MongoDB includes a number of
commands that report on the state of the database.
These data may
provide a finer level of granularity than the utilities discussed above.
Consider using their output in scripts and programs to develop custom alerts,
or to modify the behavior of your application in response to the activity of
your instance. The db.currentOp method is another useful tool for identifying the
database instance’s in-progress operations.
serverStatus The serverStatus command, or db.serverStatus()
from the shell, returns a general overview of the status of the database,
detailing disk usage, memory use, connection, journaling, and index access.
The command
returns quickly and does not impact MongoDB performance.
serverStatus
outputs an account of the state of a MongoDB instance. This command is rarely
run directly. In most cases, the data is more meaningful when aggregated, as
one would see with monitoring tools including MMS.
Nevertheless,
all administrators should be familiar with the data provided by serverStatus.
dbStats The dbStats command, or db.stats()
from the shell, returns a document that addresses storage
use and data volumes. The dbStats reflect the amount of storage used, the
quantity of data contained in the database, and object, collection, and index
counters.
Use this data
to monitor the state and storage capacity of a specific database. This output
also allows you to compare use between databases and to determine the average
document size in a database.
collStats The collStats provides statistics
that resemble dbStats on the collection level, including a count of the objects
in the collection, the size of the collection, the amount of disk space used by
the collection, and information about its indexes.
replSetGetStatus The replSetGetStatus command
(rs.status() from the shell) returns an overview of your replica set’s status.
The replSetGetStatus document details the state and configuration of the
replica set and statistics about its members. Use this data to ensure that
replication is properly configured, and to check the connections between the
current host and the other members of the replica set.
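A minimal sketch of the shell helpers for the commands discussed above; mycollection is a hypothetical collection name:
db.serverStatus()         // serverStatus
db.stats()                // dbStats
db.mycollection.stats()   // collStats
rs.status()               // replSetGetStatus, on a replica set member
db.currentOp()            // in-progress operations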
Third Party
Tools A number of third party monitoring tools have support for MongoDB, either
directly, or through their own plugins.
165.Run Multiple Database Instances on the Same System
In many cases
running multiple instances of mongod on a single system is not recommended. On
some types of deployments and for testing purposes you may need to run more
than one mongod on a single system.
In these cases,
use a base configuration for each instance, but consider the following
configuration values:
dbpath = /srv/mongodb/db0/
pidfilepath = /srv/mongodb/db0.pid
The dbPath
value controls the location of the mongod instance’s data directory. Ensure
that each database has a distinct and well labeled data directory. The
pidFilePath controls where the mongod process places its process id file. As this tracks the specific mongod process, it is crucial that this file be unique and well
labeled to make it easy to start and stop these processes.
Create
additional control scripts and/or adjust your existing MongoDB configuration
and control script as needed to control these processes.
166. What are diagnostic configurations for performance issues?
The following configuration options control various mongod behaviors for diagnostic purposes. The following settings have default values that are tuned for general production purposes:
slowms = 50
profile = 3
verbose = true
objcheck = true
Use the base
configuration and add these options if you are experiencing some unknown issue
or performance problem as needed:
• slowOpThresholdMs configures the threshold for considering a query "slow," for the purpose of the logging system and the database profiler. The default value is 100 milliseconds. Set a lower value if the database profiler does not return useful results, or a higher value to only log the longest running queries.
• mode sets the
database profiler level. The profiler is not active by default because of the
possible impact of the profiler itself on performance. Unless this setting has
a value, queries are not profiled.
• verbosity
controls the amount of logging output that mongod writes to the log. Only use
this option if you are experiencing an issue that is not reflected in the
normal logging level.
•
wireObjectCheck forces mongod to validate all requests from clients upon
receipt. Use this option to ensure that invalid requests are not causing
errors, particularly when running a database with untrusted clients.
This option may
affect database performance.
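You can also enable the profiler at run time from the shell rather than the configuration file; this sketch uses level 1 (profile operations slower than the supplied threshold, 100 ms here) and then inspects the most recent profiled operations:
db.setProfilingLevel(1, 100)
db.system.profile.find().sort( { ts: -1 } ).limit(5)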
167. MongoDB Performance Monitoring by OS commands ?
iostat On
Linux, use the iostat command to check if disk I/O is a bottleneck for your
database. Specify a number of seconds when running iostat to avoid displaying
stats covering the time since server boot.
For example,
the following command will display extended statistics and the time for each
displayed report, with traffic in MB/s, at one second intervals:
iostat -xmt 1
Key fields from iostat:
• %util: this is the most useful field for a
quick check, it indicates what percent of the time the device/drive is in use.
• avgrq-sz: average request size. Smaller
numbers for this value reflect more random IO operations.
bwm-ng bwm-ng is a command-line tool for monitoring network use. If you suspect a network-based bottleneck, you may use bwm-ng to begin your diagnostic process.
168. What is Connection Pools and use?
To avoid
overloading the connection resources of a single mongod or mongos instance,
ensure that clients maintain reasonable connection pool sizes.
The connPoolStats
database command returns information regarding the number of open connections
to the current database for mongos instances and mongod instances in sharded
clusters.
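A minimal sketch of checking connection usage from the shell:
db.runCommand( { connPoolStats: 1 } )
db.serverStatus().connections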
169. Is authorization enabled by default in MongoDB?
By default,
authorization is not enabled and mongod assumes a trusted environment. You can
enable security/auth mode if you need it.
170. Collection Export with mongoexport
With the mongoexport
utility you can create a backup file. In the most simple invocation, the
command takes the following form:
mongoexport --collection collection
--out collection.json
This will
export all documents in the collection named collection into the file
collection.json. Without the output specification (i.e. “--out
collection.json”), mongoexport writes output to standard output (i.e.
“stdout”). You can further narrow the
results by supplying a query filter using the “--query” and limit results to a
single database using the “--db” option. For instance:
mongoexport --db sales --collection
contacts --query '{"field": 1}'
This command
returns all documents in the sales database’s contacts collection, with a field
named field with a value of 1. Enclose the query in single quotes (e.g. ’) to
ensure that it does not interact with your shell environment. The resulting
documents will return on standard output.
By default, mongoexport returns one JSON document per
MongoDB document. Specify the “--jsonArray”
argument to return the export as a single JSON array. Use the “--csv” option to return the result in CSV (comma separated values) format.
If your mongod
instance is not running, you can use the “--dbpath”
option to specify the location to your MongoDB instance’s database files. See
the following example:
mongoexport --db sales --collection
contacts --dbpath /srv/MongoDB/
This reads the
data files directly. This locks the data directory to prevent conflicting
writes. The mongod process must not be running or attached to these data files
when you run mongoexport in this configuration.
The “--host”
and “--port” options allow you to specify a
non-local host to connect to capture the export. Consider the following
example:
mongoexport --host
mongodb1.example.net --port 37017 --username user --password pass --collection
contacts
On any mongoexport command you may, as above, specify username and password credentials.
171. Collection Import with mongoimport
Use mongoimport to restore a backup taken with mongoexport. Most of the arguments to mongoexport also exist for mongoimport. Consider the following command:
mongoimport --collection collection
--file collection.json
This imports
the contents of the file collection.json into the collection named collection.
If you do not specify a file with the “--file”
option, mongoimport accepts input over standard input (e.g. “stdin.”)
If you specify
the “--upsert” option, all mongoimport
operations will attempt to update existing documents in the database and insert
other documents. This option will cause some performance impact depending on
your configuration.
You can specify
the database option --db
to import these documents to a particular database. If your MongoDB instance is
not running, use the “--dbpath”
option to specify the location of your MongoDB instance’s database files.
Consider using the “--journal”
option to ensure that mongoimport records its operations in the journal.
The mongod
process must not be running or attached to these data files when you run
mongoimport in this configuration.
Use the “--ignoreBlanks”
option to ignore blank fields. For CSV and TSV imports, this option provides
the desired functionality in most cases: it avoids inserting blank fields in
MongoDB documents.
172. What are parts of GridFS?
GridFS consists
of two parts. More specifically, it consists of two collections. One collection
holds the filename and related information such as size (called metadata),
while the other collection holds the file data itself, usually in 256k chunks.
The specification calls for these to be named files and chunks respectively. By
default, the files and chunks collections are created in the fs namespace, but
this can be
changed. The
ability to change the default namespace is useful if you want to store
different types of files. For example, you might want to keep image and movie
files separate.
173. How do you limit the number of items added to a capped collection? How does it work?
You can also
limit the number of items added into a capped collection using the max:
parameter when you create the collection. However, you must take care that you
ensure that there is enough space in the collection for the number of items you
want to add. If the collection becomes full before the number of items has been
reached, the oldest item in the collection will be removed.
The MongoDB
shell includes a utility that lets you see the amount of space used by an
existing collection, whether it’s capped or uncapped. You invoke this utility
using the validate() function. This can be particularly useful if you want to
estimate how large a collection might become.
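A short sketch of both points; the collection name and numbers are examples only:
db.createCollection( "log", { capped: true, size: 1048576, max: 1000 } )   // cap by bytes and by document count
db.log.validate()   // report on the space used by an existing collection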
174. What are the limitations of capped collections regarding update and delete operations?
Documents
already added to a capped collection can be updated, but they must not grow in
size. The update will fail if they do.
Deleting
documents from a capped collection is also not possible; instead, the entire
collection must
be dropped and re-created if you want to do this.