bson_index man page

bson_index — Libbson

A Cross Platform BSON Library for C

Introduction

libbson builds, parses, and iterates BSON documents, the native data format of MongoDB. It also converts BSON to and from JSON, and provides a platform compatibility layer for the MongoDB C Driver.

Installing libbson

The following guide will step you through the process of downloading, building, and installing the current release of libbson.

Supported Platforms

The library is continuously tested on GNU/Linux, Windows 7, Mac OS X 10.10, and Solaris 11 (Intel and Sparc), with GCC, Clang, and Visual Studio 2010, 2013, and 2015.

The library supports the following operating systems and CPU architectures:

Operating Systems          CPU Architectures   Compiler Toolchain
GNU/Linux                  x86 and x86_64      GCC 4.1 and newer
Solaris 11                 ARM                 Clang 3.3 and newer
Mac OS X 10.6 and newer    PPC                 Microsoft Visual Studio 2010 and newer
Windows Vista, 7, and 8    SPARC               Oracle Solaris Studio 12
FreeBSD                                        MinGW

Install with a Package Manager

The libbson package is available on recent versions of Debian and Ubuntu.

$ apt-get install libbson-1.0

On Fedora, a libbson package is available in the default repositories and can be installed with:

$ dnf install libbson

On recent Red Hat systems, such as CentOS and RHEL 7, a libbson package is available in the EPEL repository. To check which version is available, see https://apps.fedoraproject.org/packages/libbson. The package can be installed with:

$ yum install libbson

Installing from Source

The following instructions are for UNIX-like systems such as GNU/Linux, FreeBSD, and Solaris. To build on Windows, see the instructions for Building on Windows.

The most recent release of libbson is 1.6.2 and can be downloaded here. The following snippet will download and extract the current release of the library.

$ wget https://github.com/mongodb/libbson/releases/download/1.6.2/libbson-1.6.2.tar.gz
$ tar -xzf libbson-1.6.2.tar.gz
$ cd libbson-1.6.2/

Minimal dependencies are needed to build libbson. On UNIX-like systems, pthreads (the POSIX threading library) is required.

Make sure you have access to a supported toolchain such as GCC, Clang, Solaris Studio, or MinGW. Optionally, if your system supports it, pkg-config can simplify locating the proper compiler and linker arguments when compiling your program.

The following will configure for a typical 64-bit Linux system such as Red Hat Enterprise Linux 6 or CentOS 6. Note that not all systems place 64-bit libraries in /usr/lib64. If you are building 64-bit versions of the library, check which convention your system uses.

$ ./configure --prefix=/usr --libdir=/usr/lib64

For a list of all configure options, run ./configure --help.

If configure completed successfully, you'll see something like the following describing your build configuration.

libbson 1.6.2 was configured with the following options:

Build configuration:
  Enable debugging (slow)                          : no
  Enable extra alignment (required for 1.0 ABI)    : no
  Compile with debug symbols (slow)                : no
  Enable GCC build optimization                    : yes
  Code coverage support                            : no
  Cross Compiling                                  : no
  Big endian                                       : no
  Link Time Optimization (experimental)            : no

Documentation:
  man                                              : no
  HTML                                             : no

We can now build libbson with the venerable make program.

$ make

To install the library, we use make with the install target.

$ sudo make install

Building on Windows

Building on Windows requires Windows Vista or newer and Visual Studio 2010 or newer. Additionally, cmake is required to generate Visual Studio project files.

Let's start by generating Visual Studio project files for libbson. The following assumes we are compiling for 64-bit Windows using Visual Studio 2015 Express, which can be freely downloaded from Microsoft.

> cd libbson-1.6.2
> cmake -G "Visual Studio 14 2015 Win64" \
  "-DCMAKE_INSTALL_PREFIX=C:\libbson"
> msbuild.exe ALL_BUILD.vcxproj
> msbuild.exe INSTALL.vcxproj

You should now see libbson installed in C:\libbson

You can disable building the tests with:

> cmake -G "Visual Studio 14 2015 Win64" \
  "-DCMAKE_INSTALL_PREFIX=C:\libbson" \
  "-DENABLE_TESTS:BOOL=OFF"

Tutorial

Creating a BSON Document

The bson_t structure

BSON documents are created using the bson_t structure. This structure encapsulates the necessary logic for encoding using the BSON Specification. At the core, bson_t is a buffer manager and set of encoding routines.

Let's start by creating a new BSON document on the stack. Whenever using libbson, make sure you #include <bson.h>.

bson_t b;

bson_init (&b);

This creates an empty document. In JSON, this would be the same as {}.

We can now proceed to adding items to the BSON document. A variety of functions prefixed with bson_append_ can be used based on the type of field you want to append. Let's append a UTF-8 encoded string.

bson_append_utf8 (&b, "key", -1, "value", -1);

Notice the two -1 parameters. The first indicates that the length of key in bytes should be determined with strlen(). Alternatively, we could have passed the number 3. The same goes for the second -1, but for value.

Libbson provides macros to make this less tedious when using string literals. The following two appends are identical.

bson_append_utf8 (&b, "key", -1, "value", -1);
BSON_APPEND_UTF8 (&b, "key", "value");

Now let's take a look at an example that adds a few different field types to a BSON document.

bson_t b = BSON_INITIALIZER;

BSON_APPEND_INT32 (&b, "a", 1);
BSON_APPEND_UTF8 (&b, "hello", "world");
BSON_APPEND_BOOL (&b, "bool", true);

Notice that we omitted the call to bson_init(). By specifying BSON_INITIALIZER we can remove the need to initialize the structure to a base state.

Sub-Documents and Sub-Arrays

To simplify the creation of sub-documents and arrays, bson_append_document_begin() and bson_append_array_begin() exist. These can be used to build a sub-document using the parent document's memory region as the destination buffer.

bson_t parent;
bson_t child;
char *str;

bson_init (&parent);
bson_append_document_begin (&parent, "foo", 3, &child);
bson_append_int32 (&child, "baz", 3, 1);
bson_append_document_end (&parent, &child);

str = bson_as_json (&parent, NULL);
printf ("%s\n", str);
bson_free (str);

bson_destroy (&parent);

{ "foo" : { "baz" : 1 } }

Simplified BSON C Object Notation

Creating BSON documents by hand can be tedious and time consuming. BCON, or BSON C Object Notation, was added to allow for the creation of BSON documents in a format that looks closer to the destination format.

The following example shows the use of BCON. Notice that values for fields are wrapped in the BCON_* macros. These are required for the variadic processor to determine the parameter type.

bson_t *doc;

doc = BCON_NEW ("foo",
                "{",
                "int",
                BCON_INT32 (1),
                "array",
                "[",
                BCON_INT32 (100),
                "{",
                "sub",
                BCON_UTF8 ("value"),
                "}",
                "]",
                "}");

This creates the following document:

{ "foo" : { "int" : 1, "array" : [ 100, { "sub" : "value" } ] } }

Handling Errors

Description

Many libbson functions report errors by returning NULL or -1 and filling out a bson_error_t structure with an error domain, error code, and message.

  • error.domain names the subsystem that generated the error.
  • error.code is a domain-specific error type.
  • error.message describes the error.

Some error codes overlap with others; always check both the domain and code to determine the type of error.

Domain             Code                                 Description
BSON_ERROR_JSON    BSON_JSON_ERROR_READ_CORRUPT_JS      bson_json_reader_t tried to parse invalid MongoDB Extended JSON.
                   BSON_JSON_ERROR_READ_INVALID_PARAM   Tried to parse a valid JSON document that is invalid as MongoDB Extended JSON.
                   BSON_JSON_ERROR_READ_CB_FAILURE      An internal callback failure during JSON parsing.
BSON_ERROR_READER  BSON_ERROR_READER_BADFD              bson_json_reader_new_from_file could not open the file.

ObjectIDs

Libbson provides a simple way to generate ObjectIDs. It can be used in a single-threaded or multi-threaded manner depending on your requirements.

The bson_oid_t structure represents an ObjectID in MongoDB. It is a 96-bit identifier that includes various information about the system generating the OID.

Composition

  • 4 bytes : The UNIX timestamp in big-endian format.
  • 3 bytes : The first 3 bytes of MD5(hostname).
  • 2 bytes : The pid_t of the current process. Alternatively the task-id if configured.
  • 3 bytes : A 24-bit monotonic counter incrementing from rand() in big-endian.
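The layout above can be illustrated by splitting a raw 12-byte identifier into its four fields. This is only a sketch using the standard library: oid_parts_t and oid_split() are hypothetical names, not libbson API, and the 3-byte host field is packed into an integer purely for display.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the four ObjectID fields; not a libbson type. */
typedef struct {
   uint32_t timestamp; /* 4 bytes: seconds since the Unix epoch */
   uint32_t host;      /* 3 bytes: packed into an integer for display */
   uint16_t pid;       /* 2 bytes: process (or task) id */
   uint32_t counter;   /* 3 bytes: incrementing counter */
} oid_parts_t;

/* Split a raw 12-byte ObjectID, reading every field as big-endian. */
oid_parts_t
oid_split (const uint8_t bytes[12])
{
   oid_parts_t p;

   p.timestamp = ((uint32_t) bytes[0] << 24) | ((uint32_t) bytes[1] << 16) |
                 ((uint32_t) bytes[2] << 8) | (uint32_t) bytes[3];
   p.host = ((uint32_t) bytes[4] << 16) | ((uint32_t) bytes[5] << 8) |
            (uint32_t) bytes[6];
   p.pid = (uint16_t) (((uint32_t) bytes[7] << 8) | (uint32_t) bytes[8]);
   p.counter = ((uint32_t) bytes[9] << 16) | ((uint32_t) bytes[10] << 8) |
               (uint32_t) bytes[11];

   return p;
}
```

Because the timestamp occupies the leading bytes in big-endian order, comparing identifiers byte by byte orders them first by creation second.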

Sorting ObjectIDs

The typical way to sort in C is using qsort(). Therefore, Libbson provides a qsort() compatible callback function named bson_oid_compare(). It returns a value less than 0, greater than 0, or equal to 0 depending on the ordering of two bson_oid_t structures.
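To show how a qsort() compatible callback plugs into a sort, here is a minimal sketch that does not depend on libbson. demo_oid_t, demo_oid_compare(), and demo_oid_sort() are hypothetical stand-ins; the comparison is a plain memcmp() over the 12 bytes, which is an assumption made for illustration rather than a statement about the library's internals.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for bson_oid_t: 12 raw bytes. */
typedef struct {
   uint8_t bytes[12];
} demo_oid_t;

/* qsort()-compatible callback: negative, zero, or positive result. */
int
demo_oid_compare (const void *a, const void *b)
{
   const demo_oid_t *left = (const demo_oid_t *) a;
   const demo_oid_t *right = (const demo_oid_t *) b;

   return memcmp (left->bytes, right->bytes, sizeof left->bytes);
}

/* Sort an array of identifiers in ascending byte order. */
void
demo_oid_sort (demo_oid_t *oids, size_t n)
{
   qsort (oids, n, sizeof *oids, demo_oid_compare);
}
```

With real bson_oid_t values, the same qsort() call shape would apply with bson_oid_compare() as the callback.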

Comparing Object IDs

If you simply want to compare two bson_oid_t structures for equality, use bson_oid_equal().

Generating

To generate a bson_oid_t, you may use the following.

bson_oid_t oid;

bson_oid_init (&oid, NULL);

Parsing ObjectID Strings

You can also parse a string containing a bson_oid_t. The input string MUST be 24 characters or more in length.

bson_oid_t oid;

bson_oid_init_from_string (&oid, "123456789012345678901234");

If you need to parse many bson_oid_t values in a tight loop and can guarantee the data is safe, you might consider using the inline variant. It will be inlined into your code and avoid the overhead of a function call.

bson_oid_t oid;

bson_oid_init_from_string_unsafe (&oid, "123456789012345678901234");

Hashing ObjectIDs

If you need to store items in a hashtable, you may want to use the bson_oid_t as the key. Libbson provides a hash function for just this purpose. It is based on DJB hash.

uint32_t hash;

hash = bson_oid_hash (&oid);
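For readers unfamiliar with the DJB scheme, the sketch below shows the classic form: start at 5381, then for each byte multiply by 33 and add the byte. demo_djb_hash() is a hypothetical helper over a raw byte buffer; whether it matches the library's function bit for bit is not something this sketch claims.

```c
#include <stddef.h>
#include <stdint.h>

/* DJB-style hash: start at 5381, multiply by 33, add each byte. */
uint32_t
demo_djb_hash (const uint8_t *bytes, size_t len)
{
   uint32_t hash = 5381;
   size_t i;

   for (i = 0; i < len; i++) {
      /* (hash << 5) + hash is hash * 33 without a multiply. */
      hash = ((hash << 5) + hash) + bytes[i];
   }

   return hash;
}
```

A hashtable would typically reduce the result with a modulo over its bucket count.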

Fetching ObjectID Creation Time

You can easily fetch the time that a bson_oid_t was generated using bson_oid_get_time_t().

time_t t;

t = bson_oid_get_time_t (&oid);
printf ("The OID was generated at %u\n", (unsigned) t);

Parsing and Iterating BSON Documents

Parsing

BSON documents are lazily parsed as necessary. To begin parsing a BSON document, use one of the provided Libbson functions to create a new bson_t from existing data such as bson_new_from_data(). This will make a copy of the data so that additional mutations may occur to the BSON document.

bson_t *b;

b = bson_new_from_data (my_data, my_data_len);
if (!b) {
   fprintf (stderr, "The specified length embedded in <my_data> did not match "
                    "<my_data_len>\n");
   return;
}

bson_destroy (b);

Only two checks are performed when creating a new bson_t from an existing buffer. First, the document must begin with the buffer length, matching what was expected by the caller. Second, the document must end with the expected trailing \0 byte.
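The two checks can be sketched in plain C as follows. demo_bson_frame_ok() is a hypothetical helper written only to illustrate them; it relies on the BSON length prefix being a little-endian 32-bit integer and on the smallest document being 5 bytes long.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true when buf passes the length-prefix and trailing-byte checks. */
bool
demo_bson_frame_ok (const uint8_t *buf, size_t buf_len)
{
   uint32_t declared;

   /* The smallest document is 5 bytes: a 4-byte length and a trailing 0. */
   if (!buf || buf_len < 5) {
      return false;
   }

   /* The length prefix is stored little-endian. */
   declared = (uint32_t) buf[0] | ((uint32_t) buf[1] << 8) |
              ((uint32_t) buf[2] << 16) | ((uint32_t) buf[3] << 24);

   return declared == buf_len && buf[buf_len - 1] == 0;
}
```

Passing these checks says nothing about the elements in between; that is what iteration and bson_validate() are for.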

To parse the document further we use a bson_iter_t to iterate the elements within the document. Let's print all of the field names in the document.

bson_t *b;
bson_iter_t iter;

if ((b = bson_new_from_data (my_data, my_data_len))) {
   if (bson_iter_init (&iter, b)) {
      while (bson_iter_next (&iter)) {
         printf ("Found element key: \"%s\"\n", bson_iter_key (&iter));
      }
   }
   bson_destroy (b);
}

Converting a document to JSON uses a bson_iter_t and bson_visitor_t to iterate all fields of a BSON document recursively and generate a UTF-8 encoded JSON string.

bson_t *b;
char *json;

if ((b = bson_new_from_data (my_data, my_data_len))) {
   if ((json = bson_as_json (b, NULL))) {
      printf ("%s\n", json);
      bson_free (json);
   }
   bson_destroy (b);
}

Recursing into Sub-Documents

Libbson provides convenient sub-iterators to dive down into a sub-document or sub-array. Below is an example that will dive into a sub-document named "foo" and print its field names.

bson_iter_t iter;
bson_iter_t child;

if (bson_iter_init_find (&iter, doc, "foo") &&
    BSON_ITER_HOLDS_DOCUMENT (&iter) && bson_iter_recurse (&iter, &child)) {
   while (bson_iter_next (&child)) {
      printf ("Found sub-key of \"foo\" named \"%s\"\n",
              bson_iter_key (&child));
   }
}

Finding Fields using Dot Notation

Using the bson_iter_recurse() function exemplified above, bson_iter_find_descendant() can find a field for you using the MongoDB style path notation such as "foo.bar.0.baz".

Let's create a document like {"foo": {"bar": [{"baz": 1}]}} and locate the "baz" field.

bson_t *b;
bson_iter_t iter;
bson_iter_t baz;

b =
   BCON_NEW ("foo", "{", "bar", "[", "{", "baz", BCON_INT32 (1), "}", "]", "}");

if (bson_iter_init (&iter, b) &&
    bson_iter_find_descendant (&iter, "foo.bar.0.baz", &baz) &&
    BSON_ITER_HOLDS_INT32 (&baz)) {
   printf ("baz = %d\n", bson_iter_int32 (&baz));
}

bson_destroy (b);

Validating a BSON Document

If all you want to do is validate that a BSON document is valid, you can use bson_validate().

size_t err_offset;

if (!bson_validate (doc, BSON_VALIDATE_NONE, &err_offset)) {
   fprintf (stderr,
            "The document failed to validate at offset: %u\n",
            (unsigned) err_offset);
}

See the bson_validate() documentation for more information and examples.

UTF-8

Encoding

Libbson expects that you are always working with UTF-8 encoded text. Anything else is invalid API use.

If you need to walk through UTF-8 sequences, you can use the various UTF-8 helper functions distributed with Libbson.
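As background for what walking a sequence involves, the sketch below derives each sequence's length from its lead byte and steps through a string one code point at a time. demo_utf8_seq_len() and demo_utf8_count() are hypothetical helpers, independent of Libbson's own functions, and they deliberately skip stricter checks such as overlong encodings.

```c
#include <stddef.h>

/* Byte length of the sequence introduced by lead, or 0 if lead cannot
 * start a sequence (continuation byte or invalid lead byte). */
size_t
demo_utf8_seq_len (unsigned char lead)
{
   if (lead < 0x80) {
      return 1; /* 0xxxxxxx: ASCII */
   }
   if ((lead & 0xE0) == 0xC0) {
      return 2; /* 110xxxxx */
   }
   if ((lead & 0xF0) == 0xE0) {
      return 3; /* 1110xxxx */
   }
   if ((lead & 0xF8) == 0xF0) {
      return 4; /* 11110xxx */
   }
   return 0;
}

/* Count code points in a NUL-terminated string. Returns (size_t) -1 on a
 * bad lead byte or a truncated sequence. */
size_t
demo_utf8_count (const char *s)
{
   size_t count = 0;

   while (*s) {
      size_t n = demo_utf8_seq_len ((unsigned char) *s);
      size_t i;

      if (n == 0) {
         return (size_t) -1;
      }
      /* Every following byte must look like 10xxxxxx. */
      for (i = 1; i < n; i++) {
         if (((unsigned char) s[i] & 0xC0) != 0x80) {
            return (size_t) -1;
         }
      }
      s += n;
      count++;
   }

   return count;
}
```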

Validating a UTF-8 Sequence

To validate the string contained in my_string, use the following. You may pass -1 for the string length if you know the string is NUL-terminated.

if (!bson_utf8_validate (my_string, -1, false)) {
   printf ("Validation failed.\n");
}

If my_string has NUL bytes within the string, you must provide the string length. Use the following format. Notice the true at the end indicating \0 is allowed.

if (!bson_utf8_validate (my_string, my_string_len, true)) {
   printf ("Validation failed.\n");
}

For more information see the API reference for bson_utf8_validate().

Guides

Streaming BSON

bson_reader_t provides a streaming reader which can be initialized with a file descriptor or memory region. bson_writer_t provides a streaming writer which can be initialized with a memory region. (Streaming BSON to a file descriptor is not yet supported.)

Reading from a BSON Stream

bson_reader_t provides a convenient API to read sequential BSON documents from a file descriptor or memory buffer. The bson_reader_read() function will read forward in the underlying stream and return a bson_t that can be inspected and iterated upon.

#include <stdio.h>
#include <bson.h>

int
main (int argc, char *argv[])
{
   bson_reader_t *reader;
   const bson_t *doc;
   bson_error_t error;
   bool eof;

   reader = bson_reader_new_from_file ("mycollection.bson", &error);

   if (!reader) {
      fprintf (stderr, "Failed to open file.\n");
      return 1;
   }

   while ((doc = bson_reader_read (reader, &eof))) {
      char *str = bson_as_json (doc, NULL);
      printf ("%s\n", str);
      bson_free (str);
   }

   if (!eof) {
      fprintf (stderr,
               "corrupted bson document found at %u\n",
               (unsigned) bson_reader_tell (reader));
   }

   bson_reader_destroy (reader);

   return 0;
}

See bson_reader_new_from_fd(), bson_reader_new_from_file(), and bson_reader_new_from_data() for more information.

Writing a sequence of BSON Documents

bson_writer_t provides a convenient API to write a sequence of BSON documents to a memory buffer that can grow with realloc(). The bson_writer_begin() and bson_writer_end() functions will manage the underlying buffer while building the sequence of documents.

This can also be useful if you want to write to a network packet while serializing documents from a higher level language (but do so just after the packet's header).

#include <stdio.h>
#include <bson.h>
#include <assert.h>

int
main (int argc, char *argv[])
{
   bson_writer_t *writer;
   bson_t *doc;
   uint8_t *buf = NULL;
   size_t buflen = 0;
   bool r;
   int i;

   writer = bson_writer_new (&buf, &buflen, 0, bson_realloc_ctx, NULL);

   for (i = 0; i < 10000; i++) {
      r = bson_writer_begin (writer, &doc);
      assert (r);

      r = BSON_APPEND_INT32 (doc, "i", i);
      assert (r);

      bson_writer_end (writer);
   }

   bson_writer_destroy (writer);
   bson_free (buf);

   return 0;
}

See bson_writer_new() for more information.

JSON

Libbson provides routines for converting to and from the JSON format. In particular, it supports the MongoDB extended JSON format.

Converting BSON to JSON

You will often want to convert a BSON document to JSON. It is convenient for debugging as well as for use as an interchange format. To help with this, Libbson contains the function bson_as_json().

bson_t *b;
size_t len;
char *str;

b = BCON_NEW ("a", BCON_INT32 (1));

str = bson_as_json (b, &len);
printf ("%s\n", str);
bson_free (str);

bson_destroy (b);

{ "a" : 1 }

Converting JSON to BSON

Converting back from JSON is also useful and common enough that we added bson_init_from_json() and bson_new_from_json().

The following example creates a new bson_t from the JSON string {"a":1}.

bson_t *b;
bson_error_t error;

b = bson_new_from_json ("{\"a\":1}", -1, &error);

if (!b) {
   printf ("Error: %s\n", error.message);
} else {
   bson_destroy (b);
}

Streaming JSON Parsing

Libbson provides bson_json_reader_t to allow for parsing a sequence of JSON documents into BSON. The interface is similar to bson_reader_t but expects the input to be in the MongoDB extended JSON format.

/*
 * Copyright 2013 MongoDB, Inc.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *   http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */


/*
 * This program will print each JSON document contained in the provided files
 * as a BSON string to STDOUT.
 */


#include <bson.h>
#include <stdlib.h>
#include <stdio.h>


int
main (int argc, char *argv[])
{
   bson_json_reader_t *reader;
   bson_error_t error;
   const char *filename;
   bson_t doc = BSON_INITIALIZER;
   int i;
   int b;

   /*
    * Print program usage if no arguments are provided.
    */
   if (argc == 1) {
      fprintf (stderr, "usage: %s FILE...\n", argv[0]);
      return 1;
   }

   /*
    * Process command line arguments expecting each to be a filename.
    */
   for (i = 1; i < argc; i++) {
      filename = argv[i];

      /*
       * Open the filename provided in command line arguments.
       */
      if (0 == strcmp (filename, "-")) {
         reader = bson_json_reader_new_from_fd (STDIN_FILENO, false);
      } else {
         if (!(reader = bson_json_reader_new_from_file (filename, &error))) {
            fprintf (
               stderr, "Failed to open \"%s\": %s\n", filename, error.message);
            continue;
         }
      }

      /*
       * Convert each incoming document to BSON and print to stdout.
       */
      while ((b = bson_json_reader_read (reader, &doc, &error))) {
         if (b < 0) {
            fprintf (stderr, "Error in json parsing:\n%s\n", error.message);
            abort ();
         }

         if (fwrite (bson_get_data (&doc), 1, doc.len, stdout) != doc.len) {
            fprintf (stderr, "Failed to write to stdout, exiting.\n");
            exit (1);
         }
         bson_reinit (&doc);
      }

      bson_json_reader_destroy (reader);
      bson_destroy (&doc);
   }

   return 0;
}

Examples

The following example reads BSON documents from stdin and prints them to stdout as JSON.

/*
 * Copyright 2013 MongoDB, Inc.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *   http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */


/*
 * This program will print each BSON document contained in the provided files
 * as a JSON string to STDOUT.
 */


#include <bson.h>
#include <stdio.h>


int
main (int argc, char *argv[])
{
   bson_reader_t *reader;
   const bson_t *b;
   bson_error_t error;
   const char *filename;
   char *str;
   int i;

   /*
    * Print program usage if no arguments are provided.
    */
   if (argc == 1) {
      fprintf (stderr, "usage: %s [FILE | -]...\nUse - for STDIN.\n", argv[0]);
      return 1;
   }

   /*
    * Process command line arguments expecting each to be a filename.
    */
   for (i = 1; i < argc; i++) {
      filename = argv[i];

      if (strcmp (filename, "-") == 0) {
         reader = bson_reader_new_from_fd (STDIN_FILENO, false);
      } else {
         if (!(reader = bson_reader_new_from_file (filename, &error))) {
            fprintf (
               stderr, "Failed to open \"%s\": %s\n", filename, error.message);
            continue;
         }
      }

      /*
       * Convert each incoming document to JSON and print to stdout.
       */
      while ((b = bson_reader_read (reader, NULL))) {
         str = bson_as_json (b, NULL);
         fprintf (stdout, "%s\n", str);
         bson_free (str);
      }

      /*
       * Cleanup after our reader, which closes the file descriptor.
       */
      bson_reader_destroy (reader);
   }

   return 0;
}

Performance Notes

Array Element Key Building

When writing marshaling layers between higher level languages and Libbson, you will eventually need to build keys for array elements. Each element in a BSON array has a monotonic string key like "0", "1", etc. Functions such as snprintf() tend to be rather slow for this on most libc implementations, so Libbson provides bson_uint32_to_string(). It uses an internal fast path for numbers less than 1000, which covers the vast majority of arrays. For larger keys, it falls back to snprintf().

char str[16];
const char *key;
uint32_t i;

for (i = 0; i < 10; i++) {
   bson_uint32_to_string (i, &key, str, sizeof str);
   printf ("Key: %s\n", key);
}

For more information, see bson_uint32_to_string().
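To make the fast-path idea concrete, the following sketch formats one-digit and two-digit keys by hand and leaves everything else to snprintf(). It is not Libbson's implementation: demo_index_key() is a hypothetical helper, and the cutoffs were chosen only to keep the example short. The caller is assumed to pass a buffer of at least 11 bytes, enough for any uint32_t plus the terminator.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Write the decimal form of value into buf and return buf.
 * Small values are formatted by hand; the rest fall back to snprintf(). */
const char *
demo_index_key (uint32_t value, char *buf, size_t size)
{
   if (value < 10 && size >= 2) {
      buf[0] = (char) ('0' + value);
      buf[1] = '\0';
   } else if (value < 100 && size >= 3) {
      buf[0] = (char) ('0' + value / 10);
      buf[1] = (char) ('0' + value % 10);
      buf[2] = '\0';
   } else {
      snprintf (buf, size, "%u", (unsigned) value);
   }

   return buf;
}
```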

Cross Platform Notes

Endianness

The BSON specification dictates that the encoding format is little-endian. Many implementations simply ignore endianness altogether and expect to run on little-endian hardware. Libbson supports both big-endian and little-endian systems. This means we use memcpy() when appropriate instead of dereferencing, and properly convert to and from the host endian format. We expect the compiler intrinsics to optimize it to a dereference when possible.
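The memcpy() approach can be sketched as follows: copy the bytes out of the buffer (which is safe even when the source is unaligned), then swap them only if the host is big-endian. Both helpers are hypothetical names for illustration, and the runtime endianness probe stands in for whatever compile-time detection a real build would use.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 otherwise. */
int
demo_host_is_little_endian (void)
{
   const uint16_t probe = 1;
   uint8_t first;

   memcpy (&first, &probe, 1);
   return first == 1;
}

/* Read a little-endian 32-bit integer from a possibly unaligned buffer. */
int32_t
demo_read_int32_le (const uint8_t *src)
{
   uint32_t raw;
   int32_t value;

   /* memcpy() avoids dereferencing a misaligned pointer. */
   memcpy (&raw, src, sizeof raw);

   /* Swap bytes only when the host order differs from the wire order. */
   if (!demo_host_is_little_endian ()) {
      raw = ((raw & 0x000000FFu) << 24) | ((raw & 0x0000FF00u) << 8) |
            ((raw & 0x00FF0000u) >> 8) | ((raw & 0xFF000000u) >> 24);
   }

   memcpy (&value, &raw, sizeof value);
   return value;
}
```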

Threading

Libbson's data structures are NOT thread-safe. You are responsible for accessing and mutating these structures from one thread at a time.

Libbson requires POSIX threads (pthreads) on all UNIX-like platforms. On Windows, the native threading interface is used. Libbson uses your system's threading library to safely generate unique ObjectIds, and to provide a fallback implementation for atomic operations on platforms without built-in atomics.

API Reference

bson_t

BSON Document Abstraction

Synopsis

#include <bson.h>

BSON_ALIGNED_BEGIN (128)
typedef struct {
   uint32_t flags;       /* Internal flags for the bson_t. */
   uint32_t len;         /* Length of BSON data. */
   uint8_t padding[120]; /* Padding for stack allocation. */
} bson_t BSON_ALIGNED_END (128);

Description

The bson_t structure represents a BSON document. This structure manages the underlying BSON encoded buffer. For mutable documents, it can append new data to the document.

Performance Notes

The bson_t structure attempts to use an inline allocation within the structure to speed up performance of small documents. When this internal buffer has been exhausted, a heap allocated buffer will be dynamically allocated. Therefore, it is essential to call bson_destroy() on allocated documents.

Example

static void
create_on_heap (void)
{
   bson_t *b = bson_new ();

   BSON_APPEND_INT32 (b, "foo", 123);
   BSON_APPEND_UTF8 (b, "bar", "foo");
   BSON_APPEND_DOUBLE (b, "baz", 1.23f);

   bson_destroy (b);
}

bson_context_t

BSON OID Generation Context

Synopsis

#include <bson.h>

typedef enum {
   BSON_CONTEXT_NONE = 0,
   BSON_CONTEXT_THREAD_SAFE = (1 << 0),
   BSON_CONTEXT_DISABLE_HOST_CACHE = (1 << 1),
   BSON_CONTEXT_DISABLE_PID_CACHE = (1 << 2),
#ifdef BSON_HAVE_SYSCALL_TID
   BSON_CONTEXT_USE_TASK_ID = (1 << 3),
#endif
} bson_context_flags_t;

typedef struct _bson_context_t bson_context_t;

bson_context_t *
bson_context_get_default (void) BSON_GNUC_CONST;
bson_context_t *
bson_context_new (bson_context_flags_t flags);
void
bson_context_destroy (bson_context_t *context);

Description

The bson_context_t structure is a context for generating BSON ObjectIDs. It allows specialized overriding of how ObjectIDs are generated based on the application's requirements. For example, PID caching can be disabled if the application cannot detect when a call to fork() has occurred.

Example

#include <bson.h>

int
main (int argc, char *argv[])
{
   bson_context_t *ctx = NULL;
   bson_oid_t oid;

   /* use default context, via bson_context_get_default() */
   bson_oid_init (&oid, NULL);

   /* specify a local context for additional control */
   ctx = bson_context_new (BSON_CONTEXT_DISABLE_PID_CACHE |
                           BSON_CONTEXT_THREAD_SAFE);
   bson_oid_init (&oid, ctx);

   bson_context_destroy (ctx);

   return 0;
}

bson_decimal128_t

BSON Decimal128 Abstraction

Synopsis

#include <bson.h>

typedef struct {
#if BSON_BYTE_ORDER == BSON_LITTLE_ENDIAN
   uint64_t low;
   uint64_t high;
#elif BSON_BYTE_ORDER == BSON_BIG_ENDIAN
   uint64_t high;
   uint64_t low;
#endif
} bson_decimal128_t;

Description

The bson_decimal128_t structure represents the IEEE-754 Decimal128 data type.

Example

#include <bson.h>
#include <stdio.h>

int
main (int argc, char *argv[])
{
   char string[BSON_DECIMAL128_STRING];
   bson_decimal128_t decimal128;

   bson_decimal128_from_string ("100.00", &decimal128);
   bson_decimal128_to_string (&decimal128, string);
   printf ("Decimal128 value: %s\n", string);

   return 0;
}

bson_error_t

BSON Error Encapsulation

Synopsis

#include <bson.h>

typedef struct {
   uint32_t domain;
   uint32_t code;
   char message[504];
} bson_error_t;

Description

The bson_error_t structure is used as an out-parameter to pass error information to the caller. It should be stack-allocated and does not require freeing.

See Handling Errors.

Example

bson_reader_t *reader;
bson_error_t error;

reader = bson_reader_new_from_file ("dump.bson", &error);
if (!reader) {
   fprintf (
      stderr, "ERROR: %u.%u: %s\n", error.domain, error.code, error.message);
}

bson_iter_t

BSON Document Iterator

Synopsis

#include <bson.h>

typedef struct {
   /*< private >*/
} bson_iter_t;

Description

bson_iter_t is a structure used to iterate through the elements of a bson_t. It is meant to be used on the stack and can be discarded at any time as it contains no external allocation. The contents of the structure should be considered private and may change between releases, however the structure size will not change.

The bson_t MUST be valid for the lifetime of the iter and it is an error to modify the bson_t while using the iter.

Examples

bson_iter_t iter;

if (bson_iter_init (&iter, my_bson_doc)) {
   while (bson_iter_next (&iter)) {
      printf ("Found a field named: %s\n", bson_iter_key (&iter));
   }
}

bson_iter_t iter;

if (bson_iter_init (&iter, my_bson_doc) && bson_iter_find (&iter, "my_field")) {
   printf ("Found the field named: %s\n", bson_iter_key (&iter));
}

bson_iter_t iter;
bson_iter_t sub_iter;

if (bson_iter_init_find (&iter, my_bson_doc, "mysubdoc") &&
    (BSON_ITER_HOLDS_DOCUMENT (&iter) || BSON_ITER_HOLDS_ARRAY (&iter)) &&
    bson_iter_recurse (&iter, &sub_iter)) {
   while (bson_iter_next (&sub_iter)) {
      printf ("Found key \"%s\" in sub document.\n", bson_iter_key (&sub_iter));
   }
}

bson_iter_t iter;
bson_iter_t sub_iter;

if (bson_iter_init (&iter, my_doc) &&
    bson_iter_find_descendant (&iter, "a.b.c.d", &sub_iter)) {
   printf ("The type of a.b.c.d is: %d\n", (int) bson_iter_type (&sub_iter));
}

bson_json_reader_t

Bulk JSON to BSON conversion

Synopsis

#include <bson.h>

typedef struct _bson_json_reader_t bson_json_reader_t;

typedef enum {
   BSON_JSON_ERROR_READ_CORRUPT_JS = 1,
   BSON_JSON_ERROR_READ_INVALID_PARAM,
   BSON_JSON_ERROR_READ_CB_FAILURE,
} bson_json_error_code_t;

Description

The bson_json_reader_t structure is used for reading a sequence of JSON documents and transforming them to bson_t documents.

This can often be useful if you want to perform bulk operations that are defined in a file containing JSON documents.

Example

/*
 * Copyright 2013 MongoDB, Inc.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *   http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */


/*
 * This program will print each JSON document contained in the provided files
 * as a BSON string to STDOUT.
 */


#include <bson.h>
#include <stdlib.h>
#include <stdio.h>


int
main (int argc, char *argv[])
{
   bson_json_reader_t *reader;
   bson_error_t error;
   const char *filename;
   bson_t doc = BSON_INITIALIZER;
   int i;
   int b;

   /*
    * Print program usage if no arguments are provided.
    */
   if (argc == 1) {
      fprintf (stderr, "usage: %s FILE...\n", argv[0]);
      return 1;
   }

   /*
    * Process command line arguments expecting each to be a filename.
    */
   for (i = 1; i < argc; i++) {
      filename = argv[i];

      /*
       * Open the filename provided in command line arguments.
       */
      if (0 == strcmp (filename, "-")) {
         reader = bson_json_reader_new_from_fd (STDIN_FILENO, false);
      } else {
         if (!(reader = bson_json_reader_new_from_file (filename, &error))) {
            fprintf (
               stderr, "Failed to open \"%s\": %s\n", filename, error.message);
            continue;
         }
      }

      /*
       * Convert each incoming document to BSON and print to stdout.
       */
      while ((b = bson_json_reader_read (reader, &doc, &error))) {
         if (b < 0) {
            fprintf (stderr, "Error in json parsing:\n%s\n", error.message);
            abort ();
         }

         if (fwrite (bson_get_data (&doc), 1, doc.len, stdout) != doc.len) {
            fprintf (stderr, "Failed to write to stdout, exiting.\n");
            exit (1);
         }
         bson_reinit (&doc);
      }

      bson_json_reader_destroy (reader);
      bson_destroy (&doc);
   }

   return 0;
}

bson_md5_t

BSON MD5 Abstraction

Synopsis

typedef struct {
   uint32_t count[2]; /* message length in bits, lsw first */
   uint32_t abcd[4];  /* digest buffer */
   uint8_t buf[64];   /* accumulate block */
} bson_md5_t;

Description

bson_md5_t encapsulates an implementation of the MD5 algorithm. This is used in OID generation for the MD5(hostname) bytes. It is also used by some libraries such as the MongoDB C driver.

bson_oid_t

BSON ObjectID Abstraction

Synopsis

#include <bson.h>

typedef struct {
   uint8_t bytes[12];
} bson_oid_t;

Description

The bson_oid_t structure contains the 12-byte ObjectId notation defined by the BSON ObjectID specification.

ObjectId is a 12-byte BSON type, constructed using:

  • a 4-byte value representing the seconds since the Unix epoch (in Big Endian)
  • a 3-byte machine identifier
  • a 2-byte process id (Big Endian), and
  • a 3-byte counter (Big Endian), starting with a random value.
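
The layout above can be illustrated with a small standalone sketch that pulls the four fields out of a raw 12-byte buffer. It does not use libbson: the type oid_fields_t and the function oid_decode() are hypothetical names introduced only for this illustration, and the machine identifier is simply packed in the order its bytes appear.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: the four ObjectId fields in host byte order. */
typedef struct {
   uint32_t timestamp; /* seconds since the Unix epoch */
   uint32_t machine;   /* 3-byte machine identifier */
   uint16_t pid;       /* 2-byte process id */
   uint32_t counter;   /* 3-byte counter */
} oid_fields_t;

/* Decode a raw 12-byte ObjectId following the layout listed above. */
static void
oid_decode (const uint8_t bytes[12], oid_fields_t *out)
{
   assert (bytes != NULL && out != NULL);

   /* bytes 0..3: big-endian timestamp */
   out->timestamp = ((uint32_t) bytes[0] << 24) | ((uint32_t) bytes[1] << 16) |
                    ((uint32_t) bytes[2] << 8) | (uint32_t) bytes[3];
   /* bytes 4..6: machine identifier, packed in byte order */
   out->machine = ((uint32_t) bytes[4] << 16) | ((uint32_t) bytes[5] << 8) |
                  (uint32_t) bytes[6];
   /* bytes 7..8: big-endian process id */
   out->pid = (uint16_t) (((unsigned) bytes[7] << 8) | bytes[8]);
   /* bytes 9..11: big-endian counter */
   out->counter = ((uint32_t) bytes[9] << 16) | ((uint32_t) bytes[10] << 8) |
                  (uint32_t) bytes[11];
}
```

With libbson, the bytes member of a bson_oid_t could be passed as the input buffer.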

String Conversion

You can convert an ObjectId to a string using bson_oid_to_string() and back with bson_oid_init_from_string().

Hashing

A bson_oid_t can be used as a hashtable key with the functions bson_oid_hash() and bson_oid_equal().

Comparing

A bson_oid_t can be compared to another using bson_oid_compare() for qsort() style comparing and bson_oid_equal() for direct equality.

Validating

You can check that a string contains a valid hex-encoded ObjectId using the function bson_oid_is_valid().
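
As a rough illustration of what such a check involves, the standalone sketch below accepts a string only if it is exactly 24 hexadecimal characters, which is the text form of 12 bytes. The function looks_like_oid_hex() is a hypothetical name and not libbson's implementation; bson_oid_is_valid() may accept or reject some inputs differently.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if str holds exactly 24 hexadecimal characters.
 * length is the number of characters to examine, excluding any NUL. */
static bool
looks_like_oid_hex (const char *str, size_t length)
{
   size_t i;

   assert (str != NULL);

   if (length != 24) {
      return false;
   }

   for (i = 0; i < length; i++) {
      /* cast avoids undefined behavior for negative char values */
      if (!isxdigit ((unsigned char) str[i])) {
         return false;
      }
   }

   return true;
}
```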

Example

#include <bson.h>
#include <stdio.h>

int
main (int argc, char *argv[])
{
   bson_oid_t oid;
   char str[25];

   bson_oid_init (&oid, NULL);
   bson_oid_to_string (&oid, str);
   printf ("%s\n", str);

   if (bson_oid_is_valid (str, sizeof str)) {
      bson_oid_init_from_string (&oid, str);
   }

   printf ("The UNIX time was: %u\n", (unsigned) bson_oid_get_time_t (&oid));

   return 0;
}

bson_reader_t

Streaming BSON Document Reader

Synopsis

#include <bson.h>

typedef struct _bson_reader_t bson_reader_t;

bson_reader_t *
bson_reader_new_from_handle (void *handle,
                             bson_reader_read_func_t rf,
                             bson_reader_destroy_func_t df);
bson_reader_t *
bson_reader_new_from_fd (int fd, bool close_on_destroy);
bson_reader_t *
bson_reader_new_from_file (const char *path, bson_error_t *error);
bson_reader_t *
bson_reader_new_from_data (const uint8_t *data, size_t length);

void
bson_reader_destroy (bson_reader_t *reader);

Description

bson_reader_t is a structure used for reading a sequence of BSON documents. The sequence can come from a file-descriptor, memory region, or custom callbacks.

Example

/*
 * Copyright 2013 MongoDB, Inc.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *   http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */


/*
 * This program will print each BSON document contained in the provided files
 * as a JSON string to STDOUT.
 */


#include <bson.h>
#include <stdio.h>


int
main (int argc, char *argv[])
{
   bson_reader_t *reader;
   const bson_t *b;
   bson_error_t error;
   const char *filename;
   char *str;
   int i;

   /*
    * Print program usage if no arguments are provided.
    */
   if (argc == 1) {
      fprintf (stderr, "usage: %s [FILE | -]...\nUse - for STDIN.\n", argv[0]);
      return 1;
   }

   /*
    * Process command line arguments expecting each to be a filename.
    */
   for (i = 1; i < argc; i++) {
      filename = argv[i];

      if (strcmp (filename, "-") == 0) {
         reader = bson_reader_new_from_fd (STDIN_FILENO, false);
      } else {
         if (!(reader = bson_reader_new_from_file (filename, &error))) {
            fprintf (
               stderr, "Failed to open \"%s\": %s\n", filename, error.message);
            continue;
         }
      }

      /*
       * Convert each incoming document to JSON and print to stdout.
       */
      while ((b = bson_reader_read (reader, NULL))) {
         str = bson_as_json (b, NULL);
         fprintf (stdout, "%s\n", str);
         bson_free (str);
      }

      /*
       * Clean up after our reader. This closes any file the reader opened
       * itself; STDIN is left open because close_on_destroy was false.
       */
      bson_reader_destroy (reader);
   }

   return 0;
}

bson_string_t

String Building Abstraction

Synopsis

#include <bson.h>

typedef struct {
   char *str;
   uint32_t len;
   uint32_t alloc;
} bson_string_t;

Description

bson_string_t is an abstraction for building strings. As chunks are added to the string, allocations are performed in powers of two.

This API is useful if you need to build UTF-8 encoded strings.

Example

bson_string_t *str;

str = bson_string_new (NULL);
bson_string_append_printf (str, "%d %s %f\n", 0, "some string", 0.123);
printf ("%s\n", str->str);

bson_string_free (str, true);
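
The growth policy mentioned in the description, allocating in powers of two, can be sketched without libbson. The helper next_pow2() below is a hypothetical name used only for illustration and is not necessarily how the library computes its allocation size.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the nearest power of two (v itself if it already is one).
 * Valid for 1 <= v <= 2^31. */
static uint32_t
next_pow2 (uint32_t v)
{
   assert (v >= 1 && v <= UINT32_C (0x80000000));

   /* Smear the highest set bit of (v - 1) into every lower bit, then add 1. */
   v--;
   v |= v >> 1;
   v |= v >> 2;
   v |= v >> 4;
   v |= v >> 8;
   v |= v >> 16;

   return v + 1;
}
```

A buffer that needs room for 65 bytes would, under this policy, be given 128 bytes, so repeated small appends trigger only a logarithmic number of reallocations.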

bson_subtype_t

Binary Field Subtype

Synopsis

#include <bson.h>


typedef enum {
   BSON_SUBTYPE_BINARY = 0x00,
   BSON_SUBTYPE_FUNCTION = 0x01,
   BSON_SUBTYPE_BINARY_DEPRECATED = 0x02,
   BSON_SUBTYPE_UUID_DEPRECATED = 0x03,
   BSON_SUBTYPE_UUID = 0x04,
   BSON_SUBTYPE_MD5 = 0x05,
   BSON_SUBTYPE_USER = 0x80,
} bson_subtype_t;

Description

This enumeration contains the various subtypes that may be used in a binary field. See http://bsonspec.org for more information.

Example

bson_t doc = BSON_INITIALIZER;

BSON_APPEND_BINARY (&doc, "binary", BSON_SUBTYPE_BINARY, data, data_len);

bson_type_t

BSON Type Enumeration

Synopsis

#include <bson.h>

typedef enum {
   BSON_TYPE_EOD = 0x00,
   BSON_TYPE_DOUBLE = 0x01,
   BSON_TYPE_UTF8 = 0x02,
   BSON_TYPE_DOCUMENT = 0x03,
   BSON_TYPE_ARRAY = 0x04,
   BSON_TYPE_BINARY = 0x05,
   BSON_TYPE_UNDEFINED = 0x06,
   BSON_TYPE_OID = 0x07,
   BSON_TYPE_BOOL = 0x08,
   BSON_TYPE_DATE_TIME = 0x09,
   BSON_TYPE_NULL = 0x0A,
   BSON_TYPE_REGEX = 0x0B,
   BSON_TYPE_DBPOINTER = 0x0C,
   BSON_TYPE_CODE = 0x0D,
   BSON_TYPE_SYMBOL = 0x0E,
   BSON_TYPE_CODEWSCOPE = 0x0F,
   BSON_TYPE_INT32 = 0x10,
   BSON_TYPE_TIMESTAMP = 0x11,
   BSON_TYPE_INT64 = 0x12,
   BSON_TYPE_MAXKEY = 0x7F,
   BSON_TYPE_MINKEY = 0xFF,
} bson_type_t;

Description

The bson_type_t enumeration contains all of the types from the BSON Specification. It can be used to determine the type of a field at runtime.

Example

bson_iter_t iter;

if (bson_iter_init_find (&iter, doc, "foo") &&
    (BSON_TYPE_INT32 == bson_iter_type (&iter))) {
   printf ("'foo' is an int32.\n");
}

bson_uint32_to_string()

Synopsis

size_t
bson_uint32_to_string (uint32_t value,
                       const char **strptr,
                       char *str,
                       size_t size);

See Array Element Key Building for example usage.

Parameters

  • value: A uint32_t.
  • strptr: A location for the resulting string pointer.
  • str: A location to buffer the string.
  • size: A size_t containing the size of str.

Description

Converts value to a string.

If value is from 0 to 999, it will use a constant string in the data section of the library.

If not, a string will be formatted using str and snprintf().

strptr will always be set. It will either point to str or a constant string. Use this as your key.

Returns

The number of bytes in the resulting string.
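
The two paths described above, a constant string for small values and a formatted buffer otherwise, can be sketched in standalone C. This is not libbson's code: key_from_uint32() and the ten-entry small_keys table are hypothetical stand-ins that only mirror the shape of the real function.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical table of precomputed keys (the real library covers more). */
static const char *const small_keys[] = {
   "0", "1", "2", "3", "4", "5", "6", "7", "8", "9"};

static size_t
key_from_uint32 (uint32_t value, const char **strptr, char *str, size_t size)
{
   int n;

   assert (strptr != NULL && str != NULL);

   if (value < sizeof small_keys / sizeof small_keys[0]) {
      /* Fast path: point at a constant string, leave str untouched. */
      *strptr = small_keys[value];
      return strlen (*strptr);
   }

   /* Slow path: format into the caller's buffer. snprintf() returns the
    * length the full string needs, even if it had to truncate. */
   n = snprintf (str, size, "%" PRIu32, value);
   *strptr = str;

   return n < 0 ? 0 : (size_t) n;
}
```

In both cases the caller reads the key through strptr and never needs to know which path was taken.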

bson_unichar_t

Unicode Character Abstraction

Synopsis

typedef uint32_t bson_unichar_t;

Description

bson_unichar_t provides an abstraction for a single Unicode character. It is the 32-bit representation of a character. Because UTF-8 can contain multi-byte characters, this type should be used when iterating through UTF-8 text.

Example

static void
print_each_char (const char *str)
{
   bson_unichar_t c;

   for (; *str; str = bson_utf8_next_char (str)) {
      c = bson_utf8_get_char (str);
      printf ("The numeric value is %u.\n", (unsigned) c);
   }
}
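
To show why a 32-bit type is needed, the standalone sketch below decodes one UTF-8 sequence into its code point. The function utf8_decode() is a hypothetical name, not part of libbson, and it deliberately skips checks such as overlong encodings that a production decoder would perform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode the UTF-8 sequence starting at s. On success, store the number of
 * bytes consumed in *len and return the code point. On a malformed
 * sequence, store 0 in *len and return 0. */
static uint32_t
utf8_decode (const char *s, size_t *len)
{
   const unsigned char *p = (const unsigned char *) s;
   uint32_t c;
   size_t n;
   size_t i;

   assert (s != NULL && len != NULL);

   if (p[0] < 0x80) {
      c = p[0];
      n = 1;
   } else if ((p[0] & 0xE0) == 0xC0) {
      c = p[0] & 0x1F;
      n = 2;
   } else if ((p[0] & 0xF0) == 0xE0) {
      c = p[0] & 0x0F;
      n = 3;
   } else if ((p[0] & 0xF8) == 0xF0) {
      c = p[0] & 0x07;
      n = 4;
   } else {
      *len = 0; /* invalid lead byte */
      return 0;
   }

   for (i = 1; i < n; i++) {
      /* A NUL terminator fails this test, so we never read past it. */
      if ((p[i] & 0xC0) != 0x80) {
         *len = 0;
         return 0;
      }
      c = (c << 6) | (uint32_t) (p[i] & 0x3F);
   }

   *len = n;
   return c;
}
```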

bson_value_t

BSON Boxed Container Type

Synopsis

#include <bson.h>

typedef struct _bson_value_t {
   bson_type_t value_type;
   union {
      bson_oid_t v_oid;
      int64_t v_int64;
      int32_t v_int32;
      int8_t v_int8;
      double v_double;
      bool v_bool;
      int64_t v_datetime;
      struct {
         uint32_t timestamp;
         uint32_t increment;
      } v_timestamp;
      struct {
         uint32_t len;
         char *str;
      } v_utf8;
      struct {
         uint32_t data_len;
         uint8_t *data;
      } v_doc;
      struct {
         uint32_t data_len;
         uint8_t *data;
         bson_subtype_t subtype;
      } v_binary;
      struct {
         char *regex;
         char *options;
      } v_regex;
      struct {
         char *collection;
         uint32_t collection_len;
         bson_oid_t oid;
      } v_dbpointer;
      struct {
         uint32_t code_len;
         char *code;
      } v_code;
      struct {
         uint32_t code_len;
         char *code;
         uint32_t scope_len;
         uint8_t *scope_data;
      } v_codewscope;
      struct {
         uint32_t len;
         char *symbol;
      } v_symbol;
   } value;
} bson_value_t;

Description

The bson_value_t structure is a boxed type for encapsulating a runtime determined type.

Example

const bson_value_t *value;

value = bson_iter_value (&iter);

if (value->value_type == BSON_TYPE_INT32) {
   printf ("%d\n", value->value.v_int32);
}

bson_visitor_t

Synopsis

#include <bson.h>

typedef struct {
   /* run before / after descending into a document */
   bool (*visit_before) (const bson_iter_t *iter, const char *key, void *data);
   bool (*visit_after) (const bson_iter_t *iter, const char *key, void *data);
   /* corrupt BSON, or unsupported type and visit_unsupported_type not set */
   void (*visit_corrupt) (const bson_iter_t *iter, void *data);
   /* normal bson field callbacks */
   bool (*visit_double) (const bson_iter_t *iter,
                         const char *key,
                         double v_double,
                         void *data);
   bool (*visit_utf8) (const bson_iter_t *iter,
                       const char *key,
                       size_t v_utf8_len,
                       const char *v_utf8,
                       void *data);
   bool (*visit_document) (const bson_iter_t *iter,
                           const char *key,
                           const bson_t *v_document,
                           void *data);
   bool (*visit_array) (const bson_iter_t *iter,
                        const char *key,
                        const bson_t *v_array,
                        void *data);
   bool (*visit_binary) (const bson_iter_t *iter,
                         const char *key,
                         bson_subtype_t v_subtype,
                         size_t v_binary_len,
                         const uint8_t *v_binary,
                         void *data);
   /* normal field with deprecated "Undefined" BSON type */
   bool (*visit_undefined) (const bson_iter_t *iter,
                            const char *key,
                            void *data);
   bool (*visit_oid) (const bson_iter_t *iter,
                      const char *key,
                      const bson_oid_t *v_oid,
                      void *data);
   bool (*visit_bool) (const bson_iter_t *iter,
                       const char *key,
                       bool v_bool,
                       void *data);
   bool (*visit_date_time) (const bson_iter_t *iter,
                            const char *key,
                            int64_t msec_since_epoch,
                            void *data);
   bool (*visit_null) (const bson_iter_t *iter, const char *key, void *data);
   bool (*visit_regex) (const bson_iter_t *iter,
                        const char *key,
                        const char *v_regex,
                        const char *v_options,
                        void *data);
   bool (*visit_dbpointer) (const bson_iter_t *iter,
                            const char *key,
                            size_t v_collection_len,
                            const char *v_collection,
                            const bson_oid_t *v_oid,
                            void *data);
   bool (*visit_code) (const bson_iter_t *iter,
                       const char *key,
                       size_t v_code_len,
                       const char *v_code,
                       void *data);
   bool (*visit_symbol) (const bson_iter_t *iter,
                         const char *key,
                         size_t v_symbol_len,
                         const char *v_symbol,
                         void *data);
   bool (*visit_codewscope) (const bson_iter_t *iter,
                             const char *key,
                             size_t v_code_len,
                             const char *v_code,
                             const bson_t *v_scope,
                             void *data);
   bool (*visit_int32) (const bson_iter_t *iter,
                        const char *key,
                        int32_t v_int32,
                        void *data);
   bool (*visit_timestamp) (const bson_iter_t *iter,
                            const char *key,
                            uint32_t v_timestamp,
                            uint32_t v_increment,
                            void *data);
   bool (*visit_int64) (const bson_iter_t *iter,
                        const char *key,
                        int64_t v_int64,
                        void *data);
   bool (*visit_maxkey) (const bson_iter_t *iter, const char *key, void *data);
   bool (*visit_minkey) (const bson_iter_t *iter, const char *key, void *data);
   /* if set, called instead of visit_corrupt when an apparently valid BSON
    * includes an unrecognized field type (reading future version of BSON) */
   void (*visit_unsupported_type) (const bson_iter_t *iter,
                                   const char *key,
                                   uint32_t type_code,
                                   void *data);
   bool (*visit_decimal128) (const bson_iter_t *iter,
                             const char *key,
                             const bson_decimal128_t *v_decimal128,
                             void *data);

   void *padding[7];
} bson_visitor_t;

Description

The bson_visitor_t structure provides a series of callbacks that can be called while iterating a BSON document. This may simplify the conversion of a bson_t to a higher level language structure.

If the optional callback visit_unsupported_type is set, it is called instead of visit_corrupt in the specific case of an unrecognized field type. (Parsing is aborted in either case.) Use this callback to report an error like "unrecognized type" instead of simply "corrupt BSON". This future-proofs code that may use an older version of libbson to parse future BSON formats.

Example

#include <bson.h>
#include <stdio.h>

static bool
my_visit_before (const bson_iter_t *iter, const char *key, void *data)
{
   int *count = (int *) data;

   (*count)++;

   /* returning true stops further iteration of the document */

   return false;
}

static void
count_fields (bson_t *doc)
{
   bson_visitor_t visitor = {0};
   bson_iter_t iter;
   int count = 0;

   visitor.visit_before = my_visit_before;

   if (bson_iter_init (&iter, doc)) {
      bson_iter_visit_all (&iter, &visitor, &count);
   }

   printf ("Found %d fields.\n", count);
}

bson_writer_t

Bulk BSON Serialization Abstraction

Synopsis

#include <bson.h>

typedef struct _bson_writer_t bson_writer_t;

bson_writer_t *
bson_writer_new (uint8_t **buf,
                 size_t *buflen,
                 size_t offset,
                 bson_realloc_func realloc_func,
                 void *realloc_func_ctx);
void
bson_writer_destroy (bson_writer_t *writer);

Description

The bson_writer_t API provides an abstraction for serializing many BSON documents to a single memory region. The memory region may be dynamically allocated and re-allocated as more memory is demanded. This can be useful when building network packets from a high-level language. For example, you can serialize a Python Dictionary directly to a single buffer destined for a TCP packet.

Example

#include <bson.h>

int
main (int argc, char *argv[])
{
   bson_writer_t *writer;
   uint8_t *buf = NULL;
   size_t buflen = 0;
   bson_t *doc;
   int i;

   writer = bson_writer_new (&buf, &buflen, 0, bson_realloc_ctx, NULL);

   for (i = 0; i < 1000; i++) {
      /* bson_writer_begin() hands back a bson_t * for the next document. */
      bson_writer_begin (writer, &doc);
      BSON_APPEND_INT32 (doc, "i", i);
      bson_writer_end (writer);
   }

   bson_writer_destroy (writer);

   bson_free (buf);

   return 0;
}

System Clock

BSON Clock Abstraction

Synopsis

int64_t
bson_get_monotonic_time (void);
int
bson_gettimeofday (struct timeval *tv,
                   struct timezone *tz);

Description

The clock abstraction in Libbson provides a cross-platform way to handle timeouts within the BSON library. It abstracts the differences in implementations of gettimeofday() as well as providing a monotonic (incrementing only) clock in microseconds.
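
For readers unfamiliar with monotonic clocks, the sketch below shows one way to obtain a microsecond reading on a POSIX system. It is a standalone illustration under that assumption, not libbson's implementation, and monotonic_usec() is a hypothetical name.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Return a monotonic timestamp in microseconds. The value only moves
 * forward and is unaffected by changes to the wall-clock time. */
static int64_t
monotonic_usec (void)
{
   struct timespec ts;
   int rc = clock_gettime (CLOCK_MONOTONIC, &ts);

   assert (rc == 0);
   (void) rc; /* silence unused warning when NDEBUG is defined */

   return (int64_t) ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}
```

A timeout is then a matter of recording one reading at the start of an operation and comparing later readings against start plus the allowed duration.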

Memory Management

BSON Memory Abstraction.

Description

Libbson contains a lightweight memory abstraction to make portability to new platforms easier. Additionally, it helps us integrate with interesting higher-level languages. One caveat, however, is that Libbson is not designed to deal with Out of Memory (OOM) situations. Doing so requires extreme diligence throughout the application stack, and that has rarely been implemented correctly. This may change in the future. As it stands now, Libbson will abort() under OOM situations.

To aid in language binding integration, Libbson allows for setting a custom memory allocator via bson_mem_set_vtable(). The default allocator may be restored via bson_mem_restore_vtable().

Libbson Versioning

Versioning Macros and Functions

Macros

The following preprocessor macros can be used to perform various checks based on the version of the library you are compiling against. This may be useful if you only want to enable a feature on a certain version of the library.

Synopsis

#define BSON_CHECK_VERSION(major, minor, micro)

#define BSON_MAJOR_VERSION (1)
#define BSON_MINOR_VERSION (6)
#define BSON_MICRO_VERSION (2)
#define BSON_VERSION_S "1.6.2"

#define BSON_VERSION_HEX                                  \
   (BSON_MAJOR_VERSION << 24 | BSON_MINOR_VERSION << 16 | \
    BSON_MICRO_VERSION << 8)

Only compile a block on Libbson 1.1.0 and newer.

#if BSON_CHECK_VERSION(1, 1, 0)
static void
do_something (void)
{
}
#endif
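
The standalone sketch below shows how a version check of this kind and the hex packing from the synopsis can be expressed. The MYLIB_ macros and pack_version() are hypothetical names chosen so they cannot be mistaken for the real headers; the values mirror the release this page documents.

```c
#include <assert.h>
#include <stdint.h>

#define MYLIB_MAJOR_VERSION (1)
#define MYLIB_MINOR_VERSION (6)
#define MYLIB_MICRO_VERSION (2)

/* True when the compiled-against version is at least major.minor.micro. */
#define MYLIB_CHECK_VERSION(major, minor, micro)                          \
   (MYLIB_MAJOR_VERSION > (major) ||                                      \
    (MYLIB_MAJOR_VERSION == (major) && MYLIB_MINOR_VERSION > (minor)) ||  \
    (MYLIB_MAJOR_VERSION == (major) && MYLIB_MINOR_VERSION == (minor) &&  \
     MYLIB_MICRO_VERSION >= (micro)))

/* Pack a version into one integer, one byte per component, so that two
 * versions can be compared with a plain integer comparison. */
static uint32_t
pack_version (unsigned major, unsigned minor, unsigned micro)
{
   assert (major <= 0xff && minor <= 0xff && micro <= 0xff);

   return (uint32_t) major << 24 | (uint32_t) minor << 16 |
          (uint32_t) micro << 8;
}
```

Because the macro expands to a constant expression, it can be used in #if directives as well as in ordinary code.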

Author

MongoDB, Inc

Info

Mar 28, 2017 1.6.2 Libbson