This blog is dedicated to open, interoperable manufacturing software and the coolest, latest and greatest things I see every day while conducting business under the banner of Inductive Automation.

Hello, my name is Steve Hechtman and I am president of Inductive Automation. During the span of one day there is more excitement, more discovery than I can possibly keep to myself. This blog is, therefore, my outlet. WARNING: This site is highly biased in favor of the most powerful, affordable manufacturing software in the world - Ignition by Inductive Automation!

SQL vs. NoSQL in Automation

Most people know what SQL is. Commercial implementations of the SQL language have been around since the late 1970s. Today SQL database servers are available from Oracle, Microsoft, IBM and various open-source organizations.

As an analogy, I like to think of a SQL database as a multi-user spreadsheet server in which SQL commands manipulate the spreadsheets. The only difference is that the spreadsheets are called tables.

SQL commands are known as queries.  In response to a query, the database returns a result set, which is just a list of rows containing the answers. The simplest query would return all the rows from a table, but more often, the rows are filtered in certain ways to return just the answer wanted.  Data from multiple tables are often combined into one result by using a "join" query.

This gives us the relational aspect of SQL databases in that different data from different tables can be related to each other.  For example, "Return the names of all level 3 operators who work in Riverside on Line 5 and have an average line efficiency of greater than 80% for the past 12 months."  
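
To make that example concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names and sample rows are all made up for illustration, and the "past 12 months" window is passed in as a simple cutoff month. A real MES schema would look different.

```python
import sqlite3

# In-memory database with two hypothetical tables (names are assumptions).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE operators (id INTEGER PRIMARY KEY, name TEXT, level INTEGER,
                            site TEXT, line INTEGER);
    CREATE TABLE efficiency (operator_id INTEGER, month TEXT, pct REAL);
    INSERT INTO operators VALUES
        (1, 'Alice', 3, 'Riverside', 5),
        (2, 'Bob',   3, 'Riverside', 5),
        (3, 'Carol', 2, 'Riverside', 5);
    INSERT INTO efficiency VALUES
        (1, '2010-06', 10.0), (1, '2012-01', 85.0), (1, '2012-02', 90.0),
        (2, '2012-01', 70.0), (2, '2012-02', 75.0),
        (3, '2012-01', 95.0), (3, '2012-02', 95.0);
""")

# The join relates each operator row to its efficiency rows; the WHERE
# clause filters, and HAVING keeps only operators whose average is above 80.
query = """
    SELECT o.name
    FROM operators AS o
    JOIN efficiency AS e ON e.operator_id = o.id
    WHERE o.level = 3 AND o.site = 'Riverside' AND o.line = 5
      AND e.month >= ?
    GROUP BY o.id, o.name
    HAVING AVG(e.pct) > 80
    ORDER BY o.name
"""
cutoff_month = "2012-01"  # stand-in for "12 months ago"
names = [row[0] for row in conn.execute(query, (cutoff_month,))]
print(names)
```

The result set here is a single row, because only one level 3 operator averages above 80% inside the window. The point is that the question is answered by one declarative query across two tables rather than by custom code.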

NoSQL refers to an emerging class of databases loosely associated with "Big Data." Examples are CouchDB and MongoDB.

What's the Difference?
I downloaded and installed both CouchDB and MongoDB just to experiment with them. The first thing I discovered is that these aren't "plug and play" or "fill in the blanks" configuration jobs like most SQL databases are. Each has a different API to program against, and both require C++, Java, C# or some other programming language to make them do anything at all. In other words, each is intended to be integrated into other products, not through standardized connectors such as ODBC or JDBC, but through hard-core programming.

Missing are the usual SQL database front-end tools with which you can view the contents of tables, run test queries, etc. Rather than tables, NoSQL databases use the concept of "documents." Documents contain name:value pairs in which the value can contain anything at all. API methods are available to insert and manipulate documents and name:value pairs. There is no concept of a SQL database "join."
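
As a rough illustration of the document idea, the sketch below uses plain Python dictionaries as stand-ins for documents. It is not the CouchDB or MongoDB API, and every field name is hypothetical. It shows two things: documents in the same collection need not share the same fields, and without a join the application has to relate documents itself.

```python
# Two "documents" keyed by id. Note that they do not share the same fields,
# and a value can itself be a list or another set of name:value pairs.
operators = {
    "op-1": {"name": "Alice", "level": 3, "certifications": ["forklift", "lockout"]},
    "op-2": {"name": "Bob", "level": 3, "shift": {"start": "06:00", "end": "14:00"}},
}

# A second collection of documents that refer to operators by id.
readings = [
    {"operator_id": "op-1", "line": 5, "efficiency": 87.5},
    {"operator_id": "op-2", "line": 5, "efficiency": 72.5},
]

# No join is available, so the application code looks up the related
# operator document for each reading that passes the filter.
good = [
    operators[r["operator_id"]]["name"]
    for r in readings
    if r["efficiency"] > 80
]
print(good)
```

The lookup that a SQL join would do inside the database server is done here in the calling program, which is exactly the kind of custom code someone has to maintain later.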

SQL database administrators are continually challenged by the questions, "How do I scale out the size of my database?", "How can I do this without service interruptions?" and  "How can I reliably back up my database without interruptions?"

NoSQL databases address these problems because they can be scaled horizontally over dozens of machines (or more) without interrupting service. Multiple copies of a database can be replicated across numerous machines to provide redundancy. The amazing thing is how easily it's done. From what I saw, nothing more than simple configuration is required. Doing the same with SQL databases is beyond challenging.
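
To give a feel for what horizontal scaling with redundancy means, here is a toy sketch in Python. It is an assumption-laden illustration, not how CouchDB or MongoDB actually place data: the node names, the replica count and the nodes_for helper are all hypothetical. Each document key is hashed to pick a starting machine, and a copy is also stored on the next machine in the list.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical machine names
REPLICAS = 2  # assumed number of copies kept of each document


def nodes_for(key, nodes=NODES, replicas=REPLICAS):
    """Return the machines that should hold copies of the document `key`."""
    # A stable hash, so every client computes the same placement.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    # Store the document on `replicas` consecutive machines, wrapping around.
    return [nodes[(start + i) % len(nodes)] for i in range(replicas)]


placement = {key: nodes_for(key) for key in ("op-1", "op-2", "line-5")}
for key, holders in placement.items():
    print(key, "->", holders)
```

Because every document lives on two different machines, losing one machine loses no data. Simple modulo placement like this reshuffles most keys whenever a machine is added, so real products use more careful schemes to grow without interrupting service.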

NoSQL databases arose out of the challenges faced by Google, Amazon and Facebook as their empires grew.  They had significantly different challenges in dealing with huge quantities of data that the traditional SQL database solutions could not cope with. 

Solutions to Two Different Problems
SQL and NoSQL are two different paradigms, each of which addresses different problems. Each has its pros and cons. Like everything in our industry, every tool has a purpose, and using the wrong tool for the job wreaks havoc.

So which suits the MES/controls industry best? It's important to remember that in our industry maintainability is everything. So are the standards that make interconnectivity possible. You will have to connect with existing databases to make viable MES systems. Outside of online services such as Twitter, Facebook and Google, you won't find NoSQL databases in use. So why worry about them? Are you going to write custom programs that integrate NoSQL for customer applications? I hope not, because you'll be the only person in the world who can ever support them.

Here's a big exception. Are you going to develop a new product to sell that integrates NoSQL? Perhaps a data historian? In that case, if the performance proves out, you'd be on firm ground. But you'll still have to develop an industry-standard interface for it, such as OPC HDA, before anyone would buy.

1 comment:

Nathan Boeger said...

Interesting perspective! I'm going to do more research into this technology set. Obviously, I would recommend waiting to see how it pans out before jumping on the bandwagon.