Working with JSON Schema in CouchDB

I'd like to ask about good practices for JSON schemas in CouchDB. I am currently using plain CouchDB 1.6.1, without any couchapp framework (I know they are useful, but I'm worried about whether they will still be maintained in the future).

  • Where do schemas fit in CouchDB? As a normal document? A design document? Or maybe saved as a file? But if I want to validate against them, especially on the server side in the validate_doc_update function, they would have to be stored in design documents.

  • Is there a library (JavaScript would be best) that works both in CouchDB and on the client (web browser)? A library with which I could generate JSON documents and validate them automatically?

  • I am thinking about how to send data to the client, display it in input tags, and then collect it somehow and send it back to the server. Maybe use the input id as the path to the field, like:

    {"Address": {"Street": "xxx", "Nr": "33"}}

In this case the input could have id = "Address.Street", but I don't know whether this is a good decision. I would have to send the schema from the server and construct the JSON object from it, but I don't know how (unless all fields in the JSON had unique names across all levels of the hierarchy).
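For illustration, a minimal sketch of what I have in mind (the dot-separated id convention and the helper name are just assumptions):

```javascript
// Sketch: rebuild a nested document from flat input ids such as
// "Address.Street". The dot-separated id convention is an assumption;
// adapt the separator to your own naming scheme.
function assemble(fields) {
  var doc = {};
  Object.keys(fields).forEach(function (id) {
    var parts = id.split('.');
    var node = doc;
    for (var i = 0; i < parts.length - 1; i++) {
      // Create intermediate objects on the way down
      node = node[parts[i]] = node[parts[i]] || {};
    }
    node[parts[parts.length - 1]] = fields[id];
  });
  return doc;
}

// Values collected from the <input> elements, keyed by their ids
var doc = assemble({ 'Address.Street': 'xxx', 'Address.Nr': '33' });
// doc is now { Address: { Street: 'xxx', Nr: '33' } }
```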

+3




3 answers


You are asking the same question I have been asking myself for years while exploring the potential benefits of CouchDB for form-based, document-oriented data.

Initially, I hoped to find an approach that would allow validating data against the same JSON schema with the same validator code on both server and client. It turned out that this is not only possible, but also brings some additional benefits.

Where do schemas fit in CouchDB? As a normal document? A design document? Or maybe saved as a file? But if I want to validate against them, especially on the server side in the validate_doc_update function, they would have to be stored in design documents.

You're right. The design document (ddoc), which also contains the validate_doc_update function that runs before every document update, is the most common place to put these schemas. Inside the validate_doc_update function, this is the ddoc itself, so anything included in the ddoc can be accessed from the validation code.

I started storing the schemas as a JSON object in lib, the shared property/folder for CommonJS modules, e.g. lib/schemata.json. The type property of my documents specified the schema key against which the document should be validated on update: type: 'adr' → lib/schemata/adr. A schema could also refer to other schemas for individual properties; the recursive validation function followed every property down to the leaves, no matter how deeply the properties were nested. This worked well in the first project.

{
  "person": {
    "name": "/type/name",
    "adr": "/type/adr",
    ...
  },
  "name": {
    "forname": {
      "minlength": 2,
      "maxlength": 42,
      ...
    },
    "surname": {
      ...
    }
  },
  "adr": {
    ...
  }
}
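
A minimal sketch of such a recursive walker over this kind of schema (the rule names minlength/maxlength and the "/type/..." reference convention are taken from the example above; everything else is illustrative):

```javascript
// Illustrative schema store, shaped like the example above
var schemata = {
  name: { forname: { minlength: 2, maxlength: 42 } },
  person: { name: '/type/name' }
};

// Recursive validation: string schemas are references into the store,
// objects with minlength/maxlength are leaf rules, everything else is
// a nested object whose declared properties are validated in turn.
function validate(value, schema) {
  if (typeof schema === 'string') {
    // "/type/name" -> look up the referenced schema and recurse
    return validate(value, schemata[schema.split('/').pop()]);
  }
  if (schema.minlength !== undefined || schema.maxlength !== undefined) {
    return typeof value === 'string' &&
      (schema.minlength === undefined || value.length >= schema.minlength) &&
      (schema.maxlength === undefined || value.length <= schema.maxlength);
  }
  return Object.keys(schema).every(function (key) {
    return validate((value || {})[key], schema[key]);
  });
}
```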

But then I wanted to reuse a subset of these schemas in another project. Simply copying them and adding/removing some schemas would have been short-sighted: what if the shared address schema has a bug and needs updating in every project that uses it?

At this point, my schemas were stored in a single file in the repository (I use erica as the upload tool for ddocs). Then I realized that storing each schema in a separate file, e.g. adr.json, geo.json, tel.json etc., results in the same JSON structure in the ddoc on the server as before with the single-file approach, but is much better suited for source control. Not only did the smaller files lead to fewer merge conflicts and a cleaner commit history; managing schema dependencies through subrepositories (submodules) also became possible.

Another thought was to use CouchDB itself as the place to store and manage schemas. But as you already mentioned, the schemas must be available in the validate_doc_update function. I first tried the update handler approach: every doc update has to pass through a validating update handler, which itself fetches the correct schema from CouchDB:

POST /_design/validator/_update/doctype/person

function (schema, req) {
  // ... validate req.body against the "person" schema
  return [req.body, {code: 202, headers: ...}];
}

But this approach doesn't work well with nested schemas. Worse: to prevent doc updates that bypass the handler, I had to put a proxy in front of CouchDB to hide the direct built-in document update endpoints (e.g. a PUT to /db/doc_id). I have not found a way to detect inside the validate_doc_update function whether an update handler was involved or not (maybe someone has? I would be happy to read such a solution).



During this research, the issue of different versions of the same schema appeared on my radar. How should I handle that? Should all documents of the same type be valid against the same version of the schema (which would mean migrating existing data before almost every new schema version)? Should the type property also contain a version number? And so on.

But wait! What if the schema is attached to the document itself? It:

  • guarantees a compatible schema version for the content of every document
  • is available in the validate_doc_update function (in oldDoc)
  • can be replicated without admin access (which you need for ddoc updates)
  • is included in every response, so it is available for client-side requests

This sounded very interesting, and it seems to me the most CouchDB-ish approach so far. To say it clearly: attaching the schema to the document itself means storing it in a property of the document. Storing it deeply nested, or using the schema itself as the structure of the document, did not work out well in my experiments.
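A sketch of how a validate_doc_update could read the schema carried by the document itself: the property name "schema" and the simple required-fields check are assumptions, and in practice you would plug a full JSON Schema validator in at that point.

```javascript
// Sketch: validate newDoc against the schema carried by the document
// itself (taken from oldDoc when it exists, so an update cannot swap
// the schema silently). "schema" as the property name is an assumption.
function validateDocUpdate(newDoc, oldDoc) {
  if (newDoc._deleted) return;

  var schema = (oldDoc && oldDoc.schema) || newDoc.schema;
  if (!schema) {
    throw({forbidden: 'document must carry a schema'});
  }

  // Placeholder check; a real implementation would run a JSON Schema
  // validator (e.g. tv4) against the whole document here.
  (schema.required || []).forEach(function (field) {
    if (!(field in newDoc)) {
      throw({forbidden: 'missing required field: ' + field});
    }
  });
}
```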

The most delicate aspect of this approach is the C (create) in the CRUD lifecycle of a document. There are many conceivable ways to ensure that the attached schema is "correct and acceptable", but that depends on how you define those terms in your specific project.

Is there a library (JavaScript would be best) that works both in CouchDB and on the client (web browser)? A library with which I could generate JSON documents and validate them automatically?

I started implementing with the popular jQuery Validation plugin. I could use the schema as its configuration and validate automatically on the client side. On the server side, I extracted the validation functions as a CommonJS module. I expected to find a modular way to manage the code later and prevent duplication.

It turned out that most of the existing validation frameworks are very good at pattern matching and single-property validation, but not at checking dependencies between values within a single document. Also, their schema definition requirements are often too proprietary. My rule of thumb for choosing a schema definition: prefer a standardized definition (json-schema.org, microdata, RDFa, hCard, etc.) over your own invention. If you keep the structure and property names as they are, you need less documentation and fewer transformations, and sometimes you get compatibility with third-party software that your users already use (e.g. calendars, address books, etc.) for free. And if you want to implement an HTML presentation for your documents, you are well equipped to do it in a Semantic Web and SEO friendly way.

And finally, without wishing to sound arrogant: writing a schema validator is not hard. You might want to read the source code of the jQuery Validation plugin; I'm sure you will find it, as I did, quite instructive. At a time when the churn rate of front-end frameworks keeps increasing, this is arguably the most reliable way to keep your validation working. I also believe you should understand your validation implementation 100%, since it is a critical part of your application. And if you can understand a foreign implementation, you can also write the library yourself.

Ok, this was not a short answer. I'm sorry. If anyone reads this to the end and wants to see it in detail in action with example source code: upvote, and I'll write a blog post and add the URI as a comment.

+5




I will tell you how I implement it.

  • I have one database per document type, which allows me to implement one schema per database.

  • In each database, I have a _design/schema ddoc that contains the schema and a validate_doc_update function to check it.

  • I am using Tiny Validator (for v4 JSON Schema), which I include right in the _design/schema ddoc.


The _design/schema ddoc looks like this:

{
  "_id": "_design/schema",
  "libs": {
    "tv4": // Code from https://raw.githubusercontent.com/geraintluff/tv4/master/tv4.min.js
  },
  "validate_doc_update": "...",
  "schema": {
    "title": "Blog",
    "description": "A document containing a single blog post.",
    "type": "object",
    "required": ["title", "body"],
    "properties": {
      "_id": {
        "type": "string"
      },
      "_rev": {
        "type": "string"
      },
      "title": {
        "type": "string"
      },
      "body": {
        "type": "string"
      }
    }
  }
}




The validate_doc_update function looks like this:

function (newDoc, oldDoc, userCtx) {
  // Nothing to validate for deletions
  if (newDoc._deleted) return;

  // "this" is the _design/schema ddoc, so this.schema is in scope
  var tv4 = require('libs/tv4');

  if (!tv4.validate(newDoc, this.schema)) {
    throw({forbidden: tv4.error.message + ' -> ' + tv4.error.dataPath});
  }
}

Hope it helps.

+3




Perhaps the best option is to use json-schema. There are implementations in many languages. I have used tv4 in JavaScript successfully.

In order to integrate it with CouchDB, I think the best option is to define a validate_doc_update function and use a JavaScript json-schema validation engine inside it.

0








