SQL for MongoDB

Zero-SQL

The development of our AST-based SQL to MongoDB pipeline converter represents a pivotal step in making MongoDB more accessible and versatile for users accustomed to SQL. This innovation not only enhances the functionality of our Query Editor but also positions us at the forefront of database technology solutions, empowering developers with greater flexibility and control over their data workflows.

Key Features

  1. AST-based Conversion: Our converter utilizes Abstract Syntax Trees (ASTs) to adeptly manage SQL syntax intricacies. This facilitates a more accurate translation of SQL queries into MongoDB pipeline stages.

  2. Compatibility with SQL: Designed with SQL developers in mind, the converter enables MongoDB to execute queries akin to traditional SQL, making it more accessible to those familiar with SQL's syntax and logic.

  3. Complex Query Handling: The system is engineered to support complex query structures, ensuring that users can leverage the full power of SQL queries within a MongoDB framework without compromising on functionality.

  4. Implemented in Pure Go: The choice of Go as the implementation language underscores our commitment to performance and efficiency. Go's concurrency features and robustness provide a solid foundation for the converter, ensuring reliable and quick processing of queries.

Usage Benefits

  • Familiar Environment for SQL Developers: By enabling SQL-style querying, developers versed in SQL can transition to MongoDB with minimal learning curve, enhancing productivity and reducing onboarding time.

  • Enhanced Query Capabilities: Users can perform intricate data operations within MongoDB, leveraging the full potential of SQL-style querying, which can unlock new data insights and efficiencies.

  • Seamless Integration: The converter is integrated into our Query Editor, providing users with a seamless experience when crafting and executing queries. The familiar SQL interface makes for a smoother transition and more intuitive data manipulation.

ZERO-BANNER

Features

Zero-SQL supports a comprehensive set of SQL features:

  • SELECT statements with column selection and aliases

  • FROM clauses with table references

  • JOIN operations (INNER, LEFT, RIGHT)

  • WHERE clauses with complex conditions (AND, OR, nested conditions)

  • Comparison operators (=, !=, >, <, >=, <=, LIKE, ILIKE, IN, NOT IN)

  • NULL checks (IS NULL, IS NOT NULL)

  • ORDER BY clauses with ASC/DESC

  • LIMIT and OFFSET

  • GROUP BY with aggregation functions (COUNT, SUM, AVG, MIN, MAX)

  • HAVING clauses

  • Case-insensitive pattern matching with ILIKE

Examples

Simple SELECT with WHERE

SELECT name, email FROM users WHERE active = true

This means:

[
  {
    "$match": {
      "active": true
    }
  },
  {
    "$project": {
      "_id": 0,
      "name": "$name",
      "email": "$email"
    }
  }
]

Quoted Column Names

Zero-SQL fully supports quoted identifiers for column names with spaces or special characters:

SELECT "User Name", "Email Address" FROM users WHERE "Created Date" > '\''2023-01-01'\''

JOIN Operations

Simple INNER JOIN

SELECT u.name, p.title FROM users u JOIN posts p ON u.id = p.user_id

This means:

[
  {
    "$lookup": {
      "from": "posts",
      "localField": "id",
      "foreignField": "user_id",
      "as": "p"
    }
  },
  {
    "$unwind": "$p"
  },
  {
    "$project": {
      "_id": 0,
      "name": "$u.name",
      "title": "$p.title"
    }
  }
]

LEFT JOIN

SELECT u.name, p.title FROM users u LEFT JOIN posts p ON u.id = p.user_id

This means:

[
  {
    "$lookup": {
      "from": "posts",
      "localField": "id",
      "foreignField": "user_id",
      "as": "p"
    }
  },
  {
    "$unwind": {
      "path": "$p",
      "preserveNullAndEmptyArrays": true
    }
  },
  {
    "$project": {
      "_id": 0,
      "name": "$u.name",
      "title": "$p.title"
    }
  }
]

Multiple JOINs

SELECT u.name, p.title, c.name as category FROM users u 
JOIN posts p ON u.id = p.user_id 
JOIN categories c ON p.category_id = c.id

This means:

[
  {
    "$lookup": {
      "from": "posts",
      "localField": "id",
      "foreignField": "user_id",
      "as": "p"
    }
  },
  {
    "$unwind": "$p"
  },
  {
    "$lookup": {
      "from": "categories",
      "localField": "category_id",
      "foreignField": "id",
      "as": "c"
    }
  },
  {
    "$unwind": "$c"
  },
  {
    "$project": {
      "_id": 0,
      "category": "$c.name",
      "name": "$u.name",
      "title": "$p.title"
    }
  }
]

GROUP BY with Aggregation

SELECT status, COUNT(*) as total FROM orders GROUP BY status

This means:

[
  {
    "$group": {
      "_id": {
        "status": "$status"
      },
      "total": {
        "$sum": 1
      }
    }
  }
]

Complex WHERE Conditions

SELECT * FROM products WHERE (price > 100 AND category = 'electronics') 
OR (price < 50 AND category = 'books')

LIKE and Pattern Matching

SELECT name FROM users WHERE email ILIKE '%@gmail.com'

ORDER BY and LIMIT

SELECT name, created_at FROM users ORDER BY created_at DESC LIMIT 10

Complex Investment Analysis Example

SELECT _id, "Date", "Description", "Operation", SUM("Amount") AS Total_Amount, COUNT(_id) AS Investment_Count, AVG("Amount") AS Average_Investment_Amount FROM investments WHERE "Date" BETWEEN '\''2023-01-01'\'' AND '\''2023-12-31'\'' AND "Operation" IN ('\''Deposit'\'', '\''Withdrawal'\'') GROUP BY _id, "Date", "Description", "Operation" ORDER BY Total_Amount DESC LIMIT 50;

This means:

[
  {
    "$match": {
      "$and": [
        {
          "Date": {
            "$gte": "2023-01-01",
            "$lte": "2023-12-31"
          }
        },
        {
          "Operation": {
            "$in": ["Deposit", "Withdrawal"]
          }
        }
      ]
    }
  },
  {
    "$group": {
      "Average_Investment_Amount": {"$avg": "$Amount"},
      "Investment_Count": {
        "$sum": {
          "$cond": [{"$ne": ["$_id", null]}, 1, 0]
        }
      },
      "Total_Amount": {"$sum": "$Amount"},
      "_id": {
        "Date": "$Date",
        "Description": "$Description", 
        "Operation": "$Operation",
        "_id": "$_id"
      }
    }
  },
  {
    "$sort": {"Total_Amount": -1}
  },
  {
    "$limit": 50
  }
]

Supported SQL Features

SELECT Clause

  • Column selection: SELECT name, age

  • Wildcard: SELECT *

  • Column aliases: SELECT name AS full_name

  • Aggregation functions: COUNT(), SUM(), AVG(), MIN(), MAX()

FROM Clause

  • Table references: FROM users

  • Table aliases: FROM users u

JOIN Clause

  • INNER JOIN: INNER JOIN posts ON users.id = posts.user_id

  • LEFT JOIN: LEFT JOIN posts ON users.id = posts.user_id

  • Table aliases in JOINs: FROM users u JOIN posts p ON u.id = p.user_id

  • Multiple JOINs: FROM users u JOIN posts p ON u.id = p.user_id JOIN categories c ON p.category_id = c.id

  • JOINs with WHERE conditions: FROM users u JOIN posts p ON u.id = p.user_id WHERE u.active = true

WHERE Clause

  • Comparison operators: =, !=, <>, >, <, >=, <=

  • Pattern matching: LIKE, ILIKE

  • List operations: IN, NOT IN

  • NULL checks: IS NULL, IS NOT NULL

  • Logical operators: AND, OR, NOT

  • Parentheses for grouping: (condition1 OR condition2) AND condition3

GROUP BY Clause

  • Single column: GROUP BY status

  • Multiple columns: GROUP BY category, status

  • Works with aggregation functions in SELECT

HAVING Clause

  • Filter aggregated results: HAVING COUNT(*) > 5

  • Supports same operators as WHERE clause

ORDER BY Clause

  • Ascending: ORDER BY name or ORDER BY name ASC

  • Descending: ORDER BY created_at DESC

  • Multiple columns: ORDER BY category, name DESC

LIMIT and OFFSET

  • Limit results: LIMIT 10

  • Skip results: OFFSET 20 or LIMIT 20, 10

MongoDB Output

`zero-sql` generates MongoDB aggregation pipelines using these stages:

  • $lookup - for JOIN operations

  • $unwind - to flatten joined arrays

  • $match - for WHERE and HAVING clauses

  • $group - for GROUP BY clauses

  • $project - for SELECT column specification

  • $sort - for ORDER BY clauses

  • $skip - for OFFSET

  • $limit - for LIMIT

Error Handling

Zero-SQL provides detailed error messages for:

  • Invalid SQL syntax

  • Unsupported SQL features

  • Type mismatches

  • Missing required clauses

Limitations

Current limitations include:

  • Only SELECT statements are supported

  • Subqueries are not yet supported

  • Window functions are not supported

  • Some advanced SQL features may not be available

Contributing

Contributions are welcome! Please see the Contributing Guide for details. https://github.com/SyneHQ/zero-sql

Architecture

The converter package is the heart of the application, handling the conversion from SQL Abstract Syntax Trees (AST) to MongoDB aggregation pipelines.

Troubleshooting

MongoDB Aggregation Namespace Error

If you encounter an error like (InvalidNamespace) {aggregate: 1} is not valid for '$limit'; a collection is required, this means that when executing the generated aggregation pipeline in MongoDB, the collection name is not being specified correctly.

Then, in your MongoDB client/driver, use the collection name when executing the aggregation:

// MongoDB shell
db.users.aggregate([{"$limit": 10}])

// Node.js with MongoDB driver
await db.collection("users").aggregate([{"$limit": 10}]).toArray()

Last updated