Microsoft “Oslo” MGraph – the next XML?

Microsoft’s upcoming “Oslo” modeling initiative is about tools and languages. MGraph is the piece within the language “M” that defines values, while MSchema is for schemas, functions and constraints, and MGrammar is for textual DSLs. “Oslo” is still a CTP (Community Technology Preview), and it will take some time until all concepts are available for production use.

By then, Microsoft plans to publish an open specification so that anyone who wants to can implement the “M” language. Their ambition is to make it as widespread as XML is today.

Anyone can implement it. We want this approach wide spread on a variety of different platforms. We want it to be as broad as Xml is today.

Douglas Purdy, A Lap around “Oslo”

Today we have lots of XML on the wire. There’s also lots of JSON.

I want to see MGraph get there as well.

Jeff Pinkston alias Pinky:-), MGraph as a data rep

What is MGraph?

MGraph at its base is not a language. It is a simple contract for storing structured data in the form of a labeled directed graph: a set of nodes where each node has an optional label and a set of successors, each of which may be a node or any other object.

The idea is that any structure can be exposed as an MGraph just by implementing the following interface, which is the core API behind MGraph.

I added some comments for easier understanding.

// Exposes whatever structure as an MGraph
public interface IGraphBuilder
{
  // Checks, if the object submitted is a value or a (custom) node.
  bool IsNode(object value);
  // Retrieves a comparer for the (custom) node objects.
  IEqualityComparer NodeComparer { get; }
  // Extracts the label from a (custom) node.
  object GetLabel(object node);
  // Extracts the successors from a (custom) node.
  IEnumerable GetSuccessors(object node);
  // Gets a (custom) node with a label.
  object DefineNode(object label);
  // Sets successors to a (custom) node.
  void DefineSuccessors(object protoNode, IEnumerable successors);
}

As you see, even the node is not specified by an interface or a base class. Both labels and successors are extracted by the visiting graph builder.
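To make the contract more tangible, here is a minimal Python sketch of the same idea. It is not part of “Oslo”: the class TupleGraphBuilder, the helper count_values and the choice of plain (label, successors) tuples as nodes are assumptions made for illustration, and the NodeComparer member is left out. It only shows that the builder, not the node type, decides what counts as a node.

```python
class TupleGraphBuilder:
    """Hypothetical Python analogue of the IGraphBuilder members shown above.

    Nodes are plain (label, [successors]) tuples; anything else is a value.
    """

    def is_node(self, value):
        return (isinstance(value, tuple) and len(value) == 2
                and isinstance(value[1], list))

    def get_label(self, node):
        return node[0]

    def get_successors(self, node):
        return iter(node[1])

    def define_node(self, label):
        # A "proto node": the label is fixed, the successors come later.
        return (label, [])

    def define_successors(self, proto_node, successors):
        proto_node[1].extend(successors)


def count_values(builder, item):
    """Walk a structure purely through the builder and count leaf values."""
    if not builder.is_node(item):
        return 1
    return sum(count_values(builder, s) for s in builder.get_successors(item))


builder = TupleGraphBuilder()
name = builder.define_node("name")
builder.define_successors(name, ["John Smith"])
person = builder.define_node("person")
builder.define_successors(person, [name, 24])
```

The walker never inspects the tuples itself, so swapping in another node representation only requires another builder.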

If you want, you can read more about the MGraph API here.

MGraph as a language

What I care more about is what MGraph’s textual notation looks like and how it compares to XML.

On MSDN you can find a language specification covering MSchema and MGrammar, which both use parts of MGraph, but in slightly different ways. Microsoft definitely plans to bring those pieces together.

Today MGraph is used for values in MSchema extent initializers as well as for the AST (Abstract Syntax Tree) productions in MGrammar.

The basic syntax of MGraph is very similar to JSON:

label { 
  otherLabel { "value" },
  "value",
},
"value"

As mentioned previously a successor can be either a node or a value. A value is just written directly, while a node is split into a label followed by its comma-separated successors within curly braces.
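The rule above is small enough to sketch as a renderer. The following Python function is only an illustration under assumptions: nodes are modeled as hypothetical (label, successors) tuples, strings are escaped with JSON rules (the final MGraph escaping rules may differ), and the function name render is made up.

```python
import json


def render(item):
    """Render a value, or a (label, successors) tuple, in brace notation."""
    if isinstance(item, tuple):
        label, successors = item
        body = ", ".join(render(s) for s in successors)
        prefix = f"{label} " if label is not None else ""
        return f"{prefix}{{ {body} }}"
    if isinstance(item, bool):
        # Checked before numbers, because bool is a subclass of int in Python.
        return "true" if item else "false"
    if isinstance(item, str):
        return json.dumps(item, ensure_ascii=False)
    return str(item)


sample = ("label", [("otherLabel", ["value"]), "value"])
text = render(sample)
# text is: label { otherLabel { "value" }, "value" }
```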

The same data as an XML fragment would look like this:

<label>
  <otherLabel>value</otherLabel>
  value
</label>
value

One of the major differences is that MGraph doesn’t distinguish between attributes and elements. The way XML is used today, everyone chooses between attributes and elements according to personal taste anyway.

Typed values

The next big difference is that values are not just strings, but typed. Some of the intrinsic types are Text, Logical and Number.

mynode
{
  text { "some text" },
  number { 1234 },
  logical { true }
}

Find a list of the supported types in chapter 3.5 Intrinsic Types in the “Oslo” Modeling Language Specification.

Escaped Labels

While XML elements and attributes are restricted to QNames, a label in MGraph can be any object. How this is expressed in the textual syntax is not finalized yet, but in MGrammar productions more complex strings are defined with an id function.

id("some label xyz") 
{
  true, 
  id("another node") { "value" }
}

Ordered and unordered successors

To make mapping to relational structures easier, successors are unordered by default. To give the successors an order, each of them has to be wrapped in an integer-labeled node.

{
  0 { "value1" },
  1 { "value2" }
}

Alternatively, this can be expressed with brackets instead of braces. In “M” jargon this is called a sequence.

[
  "value1",
  "value2"
]
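The equivalence between the two notations can be sketched in a few lines of Python. As before, the (label, successors) tuples and the function names expand_sequence and collapse_sequence are assumptions for illustration, not anything defined by “M”. The point is that the integer labels carry the order, so the original list can be recovered even if the successors arrive shuffled.

```python
def expand_sequence(values):
    """Turn an ordered list into an unlabeled node with integer-labeled successors."""
    return (None, [(index, [value]) for index, value in enumerate(values)])


def collapse_sequence(node):
    """Recover the ordered list, whatever order the successors arrive in."""
    _, successors = node
    ordered = sorted(successors, key=lambda pair: pair[0])
    return [children[0] for _, children in ordered]


expanded = expand_sequence(["value1", "value2"])
# Simulate an unordered store by reversing the successors.
shuffled = (None, list(reversed(expanded[1])))
restored = collapse_sequence(shuffled)
```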

Single successor nodes, or labeled values

A named value in MGraph is just a labeled node with a single value successor. The equals sign is just syntactic sugar that makes it easier to read and write. In “M” jargon this is called an Entity, but the name is subject to change. Record structure might be a better name.

person
{
  name = "John Smith",
  age = 24
}

is equivalent to

person
{
  name { "John Smith" },
  age { 24 }
}

Better than XML?

XML is great. Mostly because it can be read by almost every system, not because it has such a nice syntax. It was never meant for the purpose it is used for today, either. It is a markup language for adding metadata to text.

But what XML is broadly used for today is configuration files, transport messages and even internal DSLs. For this kind of information, which has more structuring elements than data, XML is way too verbose.

Therefore I think MGraph with its tight syntax has the potential to become a great and broad alternative.

What do you think?

Comparing XML, JSON and MGraph

Comparison of MGraph, JSON and XML using the Google Maps geo-code of my home address.

JSON

http://maps.google.com/maps/geo?q=waltrop,%20lehmstr%201d&output=json

661 characters, excluding whitespace.

{
  "name": "waltrop, lehmstr 1d",
  "Status": {
    "code": 200,
    "request": "geocode"
  },
  "Placemark": [ {
    "id": "p1",
    "address": "Lehmstraße, 45731 Waltrop, Deutschland",
    "AddressDetails": {"Country": {"CountryNameCode": "DE","CountryName": "Deutschland","AdministrativeArea": {"AdministrativeAreaName": "Nordrhein-Westfalen","SubAdministrativeArea": {"SubAdministrativeAreaName": "Recklinghausen","Locality": {"LocalityName": "Waltrop","Thoroughfare":{"ThoroughfareName": "Lehmstraße"},"PostalCode": {"PostalCodeNumber": "45731"}}}}},"Accuracy": 6},
    "ExtendedData": {
      "LatLonBox": {
        "north": 51.6244226,
        "south": 51.6181274,
        "east": 7.4046111,
        "west": 7.3983159
      }
    },
    "Point": {
      "coordinates": [ 7.4013350, 51.6212620, 0 ]
    }
  } ]
}

XML

http://maps.google.com/maps/geo?q=waltrop,%20lehmstr%201d&output=xml

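1065 characters, excluding whitespace.

<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://earth.google.com/kml/2.0">
  <Response>
    <name>waltrop, lehmstr 1d</name>
    <Status>
      <code>200</code>
      <request>geocode</request>
    </Status>
    <Placemark id="p1">
      <address>Lehmstraße, 45731 Waltrop, Deutschland</address>
      <AddressDetails Accuracy="6" xmlns="urn:oasis:names:tc:ciq:xsdschema:xAL:2.0"><Country><CountryNameCode>DE</CountryNameCode><CountryName>Deutschland</CountryName><AdministrativeArea><AdministrativeAreaName>Nordrhein-Westfalen</AdministrativeAreaName><SubAdministrativeArea><SubAdministrativeAreaName>Recklinghausen</SubAdministrativeAreaName><Locality><LocalityName>Waltrop</LocalityName><Thoroughfare><ThoroughfareName>Lehmstraße</ThoroughfareName></Thoroughfare><PostalCode><PostalCodeNumber>45731</PostalCodeNumber></PostalCode></Locality></SubAdministrativeArea></AdministrativeArea></Country></AddressDetails>
      <ExtendedData>
        <LatLonBox north="51.6244226" south="51.6181274" east="7.4046111" west="7.3983159" />
      </ExtendedData>
      <Point><coordinates>7.4013350,51.6212620,0</coordinates></Point>
    </Placemark>
  </Response>
</kml>
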
MGraph

590 characters, excluding whitespace.

{
  name = "waltrop, lehmstr 1d",
  Status {
    code = 200,
    request = "geocode"
  },
  Placemark [
    {
      id = "p1",
      address = "Lehmstraße, 45731 Waltrop, Deutschland",
      AddressDetails { Country {CountryNameCode = "DE", CountryName = "Deutschland", AdministrativeArea { AdministrativeAreaName = "Nordrhein-Westfalen", SubAdministrativeArea = { SubAdministrativeAreaName = "Recklinghausen", Locality { LocalityName = "Waltrop", Thoroughfare { ThoroughfareName = "Lehmstraße" }, PostalCode = { PostalCodeNumber = "45731" }}}}}, Accuracy = 6 },
      ExtendedData {
        LatLonBox {
          north = 51.6244226,
          south = 51.6181274,
          east = 7.4046111,
          west = 7.3983159
        }
      },
      Point {
        coordinates [ 7.4013350, 51.6212620, 0 ]
      }
    }
  ]
}
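The mapping from the JSON version to the MGraph version is mechanical enough to sketch. The Python below is a rough illustration, not an official converter: the function names to_mgraph and non_whitespace_length are made up, scalars are written with JSON rules, and labels are emitted as-is, so a key containing spaces would still need the id(...) form mentioned earlier. It follows the conventions of the example above: scalars become label = value, objects become label { ... } and arrays become label [ ... ].

```python
import json


def to_mgraph(value, label=None):
    """Render parsed JSON data in the MGraph-style notation used above."""
    head = f"{label} " if label is not None else ""
    if isinstance(value, dict):
        members = ", ".join(to_mgraph(v, k) for k, v in value.items())
        return f"{head}{{ {members} }}"
    if isinstance(value, list):
        items = ", ".join(to_mgraph(v) for v in value)
        return f"{head}[ {items} ]"
    text = json.dumps(value, ensure_ascii=False)
    return f"{label} = {text}" if label is not None else text


def non_whitespace_length(text):
    """Count characters the way the comparison does: everything but whitespace."""
    return sum(1 for ch in text if not ch.isspace())


doc = {
    "Status": {"code": 200, "request": "geocode"},
    "Point": {"coordinates": [7.401335, 51.621262, 0]},
}
as_mgraph = to_mgraph(doc)
as_json = json.dumps(doc, ensure_ascii=False)
```

For this small excerpt the MGraph text comes out shorter than the JSON text, mainly because labels lose their quotes and nested nodes lose the colon.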

Resources

Thanks to Pinky and David L. for their help in getting everything right.

22 thoughts on “Microsoft “Oslo” MGraph – the next XML?

  1. I think there are some nonconformitys in the xml example, I think WordPress have insert some smilies, I hope you haven’t counted this characters.

    MGraph seems to be an cool alternative to XML and JSON, its small and have stereotypes, but if it comes broad that is the big question.
    The support for XML and JSON is IMHO too big, XML in the Business Application domain and JSON in the Web/AJAX domain, so I think it will be very difficult for Microsoft to get the critical mass of installations.

  2. Nice overview of MGraph’s advantages.

    I think two nice features that XML has but MGraph is missing are:
    1) You can specify in the XML file itself which XSD schema should be used to verify it. In practice, XML documents can be self-validating.
    2) Namespaces – which lets you mix tags and attributes from different purposes. That is the key to XML’s strong composability.

    It would be nice to write a library that can convert from XML to MGraph and back…

  3. Hi Jacob,

    I think MGraph will also have the ability to link to an MSchema, which is a lot more compact than XSD, too.

    With namespaces, you’re right. But who’s using them as we should use them? The problem is that the support for namespaces is still poor in some tools. But still, we’ll see.

    The idea of the whole M thing is that data is the same if it looks alike. Using namespaces would avoid conflicts, but would somehow also clash with that idea.

    MGrammar and MSchema have modules, which are kind of the same as namespaces in .NET. But I don’t know how this applies to values.

  4. Foremost, thanks for the good intro to MGraph. I never understood MGraph up until I saw your blog. So thanks a lot.

    On the other hand, I dont think MGraph will replace JSON or XML just because both XML and JSON came as open standards and MGraph and MSchema will stay in Microsoft domain, even if MS publishes it as open standard. Will have acceptance similar to WordML compared to OpenOffice.org just because this came from MS.

    Also your count for JSON and MGraph could have been same except the double quotes for each “Name” used by JSON Vs MGraph. I have not tried, but using quotes around “Name”(s) might allow for spaces to be embedded (May be I should try it tomorrow !).
    I do agree XML is very very wordy to represent simple repetable data. I almost created my own representation of metadata markup sometime ago like yaml but did not spend enough time to take it further. I am sure the following representation would be faster for parsers to consume.

    [1]Catalog;
    [1.1]Book.id=bk112;
    [1.1.1]Author=Galos, Mike;
    [1.1.2]Title=Visual Studio 7: A Comprehensive Guide;
    [1.1.3]Genre=Computer;
    [1.1.4]Price=49.94;
    [1.1.5]Publish_date=2001-04-16;
    [1.1.6]Description=Microsoft Visual Studio 7 is explored in depth, looking at how Visual Basic, Visual C++, C#, and ASP+ are integrated into a comprehensive development;
    [1.2]Book.id=bk111;
    [1.2.1]Author=O’Brien, Tim;
    [1.2.2]Title=MSXML#: A Comprehensive Guide;
    [1.2.3]Genre=Computer;
    [1.2.4]Price=36.95;
    [1.2.5]Publish_date=2000-12-01;
    [1.2.6]Description=The Microsoft MSXML3 parser is covered in detail, with attention to XML DOM interfaces, XSLT processing, SAX and more.;

  5. Pingback: Dew Drop - January 14, 2009 | Alvin Ashcraft's Morning Dew

  6. Pingback: Great blog post on MGraph « Douglas Purdy

  7. Pingback: Why Oslo is Important « Critical Development

  8. I use XML and xslt daily. Not sure if I see an advantage of using MGraph over XML.

    Would like to see more articles on the subject, show me what I can do with MGraph, and why it is better than XML for web applications. Ready to be school

  9. With all due respect, I think you’re missing the big picture behind Oslo’s M et. al. In a nutshell, here’s how the world is going to change w/ respect to software development:

    Goal 1. metamodel disparate implementations.

    Goal 2. close the gap between functional specification (the WHAT) and technical specification (the HOW).

    Goal 3. put problem definition and solution specification into the hands of non-techies.

    And, the X10 productivity will be realized. In fact, that’s rather reserved for vs. the overwhelming majority of conventional paradigm and practice. X1000 or more is doable, and that’s not hype. Since I’m finding very little on concrete examples, I’m thinking of blogging some. My shop’s done this 10+ years and getting an out-of-the-box toolkit is welcome, because maintaining our tools and focusing on the end deliverable turns into X2 or more work!

    Here’s a brief rundown:

    With Goal 1 you get the ability, for example, to dataflow entities from a web page to the backend database. A data entity can pass through client-side markup and script implementation, to server-side execution implementation, to backend database implementation. If you’ve scrunched it all into a single metamodel, you can deal with it much easier in every development lifecycle aspect: conception to production support.

    With Goal 2, you get into “automated specifications” that are (a) explicit, (b) concrete in terms of requirements, use cases, etc. And, (c) automated so that you can *parse* them into *implementation* (as well as round-trip implementation back into specifications).

    With Goal 3, get out of the imperative programmaing paradigm and into declarative modeling paradigm. M et. al. is not yet-another way to do XML, JSON, etc. Rather, it is a way to *declare* the building blocks of a model. That gets parsed into metadata (see Goal 1) and from there you move into Goal 2. The data gets applied to a *pre-existing framework* which results in a full-blown production quality app being produced by someone with little-to-no tech skills, but understands his/her problem domain very well.

    With Goal 4, M will be used to define zillions of DSLs, at first. Then, you’ll see an industry shakedown into standardized DSLs for about every problem space out there. And, these won’t be anything like programming languages. They’ll be declarative in nature, coupled with graphical representations (or generated from that in many cases).

    This is where Oslo is headed. Microsoft is mum on the generator portion, the part providing the pre-built frameworks into which declarative data modes are applied. But, it’s not a new idea in the least bit.

  10. With all due respect, you’re totally missing what the intent of that post was 🙂

    I know “Oslo” and their goals quite well (just read my other posts)… This is just a post totally focussed on the data representation format in “Oslo”, which is MGraph.

    Ok, the title is a little bit provocative. But that was intended, too.

  11. I’ve been digging into MGraph a bit further, and there are two very positive points :
    1. It’s very easy to parse (like jason, unlike xml)
    2. There are representation for typed native types (unlike both jason and xml… ok for xml with typing namespaces but it becomes really a mess to parse and becomes extremly verbose).

    Available types are:
    String (both “text” and @”text” – called verbatim)
    Char (‘a’)
    Integer (1234)
    Decimal (12.34)
    Scientifics (float and doubles : 1.3E10)
    Date (2009-04-18 maps to DateTime)
    Datetime (2009-04-18T19:55:40.425 maps to DateTime)
    Time (1:20:30.123 maps to TimeSpan)
    Guid (#[xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx] )
    Binary (0x0123456789abcdef maps to byte[])

    Json just provide ints, floats and strings… the date support seems to be highly controversed…

  12. Pingback: Bookmarks [2009-11-22] (tamnd) « Scapbi02's Blog

  13. It has been mentioned before, but instead of always pointing to XML, advocates of MGraph should better compare it to YAML which is much more similar. This lack looks like the developers of M are either ignorant or blind which both does not make MGraph look very sophisticated.

  14. Pingback: Where the hell is XML successor? « Imagination Overflow
