{"id":64,"date":"2012-10-21T01:52:57","date_gmt":"2012-10-21T01:52:57","guid":{"rendered":"http:\/\/clayb.net\/blog\/?p=64"},"modified":"2012-10-21T01:52:57","modified_gmt":"2012-10-21T01:52:57","slug":"configuration-files","status":"publish","type":"post","link":"https:\/\/clayb.net\/blog\/configuration-files\/","title":{"rendered":"Configuration Files"},"content":{"rendered":"<p>Many systems need to store configuration parameters, and there are a number of choices for how to store that data; sometimes this diversity is painful, however. Common choices for storing configuration data include:<\/p>\n<ul>\n<li>Firefox uses, and Apple often chooses to use, <a href=\"http:\/\/sqlite3.org\">sqlite3<\/a> databases<sup><a href=\"http:\/\/www.sqlite.org\/famous.html\">1<\/a><\/sup><\/li>\n<li>Python programs often use <a href=\"http:\/\/docs.python.org\/library\/ConfigParser.html\">ConfigParser<\/a> to process <a href=\"http:\/\/en.wikipedia.org\/wiki\/INI_file\">initialization (<tt>.ini<\/tt>) files<\/a><\/li>\n<li><a href=\"http:\/\/ant.apache.org\/\">Apache Ant<\/a>, amongst many other applications, consumes XML configurations<\/li>\n<li>Java programs often use <a href=\"http:\/\/download.oracle.com\/javase\/7\/docs\/api\/java\/util\/Properties.html\">Properties<\/a> files, in XML or traditional form<\/li>\n<li>JavaScript Object Notation (<a href=\"http:\/\/json.org\">JSON<\/a>) is used by programs for configuration; a number of my group&#8217;s programs use this, for example<\/li>\n<li>Domain Specific Languages (DSLs) are sometimes used. For example, the <a href=\"http:\/\/puppetlabs.com\">Puppet<\/a> configuration management system has its own DSL written in Ruby<\/li>\n<\/ul>\n<p>This diversity of configuration formats sometimes sees cross-pollination, however. Sometimes, an application only reads one format while another application only writes a different one. 
Sometimes, one has a toolset which works with only one format, and many an application that has grown organically finds itself using several formats at once.<\/p>\n<p>Annoyingly, not all formats support the same set of features either. For example, SQLite3 and XML can be multidimensional; SQLite3 supports multiple N-row by M-column tables in a single SQLite3 file, while XML supports a hierarchical tree structure of tags with multiple leaves using attributes on tags. JSON is comparable to XML, offering rich structure for organizing one&#8217;s data. Python&#8217;s initialization file implementation offers only a two-level hierarchy (sections and keys); Java Properties files are flat but often use Java dot-notation to make namespaces which can represent an arbitrarily deep hierarchy. Domain specific languages can be as rich or simple as desired, but no common structure or properties are inherent in such configuration formats.<\/p>\n<p>This asymmetry can make conversion across formats difficult in general, but one should always be able to go from a less rich structure to a richer one. When possible, it is nice to have some tools to go between them.<\/p>\n<h2>Java Properties Files<\/h2>\n<h3>Using with Python<\/h3>\n<p>One can find a <a href=\"http:\/\/code.activestate.com\/recipes\/496795-a-python-replacement-for-javautilproperties\/\">recipe<\/a> to read and write Java Properties files from Python. This re-implementation of the <tt>java.util.Properties<\/tt> class provides a convenient interface for working with properties files:<\/p>\n<pre>&gt;&gt;&gt; import properties\r\n&gt;&gt;&gt; p=properties.Properties()\r\n&gt;&gt;&gt; with file(\"my.properties\") as f:\r\n...     p.load(f)\r\n&gt;&gt;&gt; p.getPropertyDict()['some_property_I_want']\r\n'this_is_not_the_property_value_you_want!'\r\n&gt;&gt;&gt; p.setProperty('some_property_I_want', 'with_the_value_I_want!')\r\n&gt;&gt;&gt; with file(\"my.properties\", \"w\") as f:\r\n...     
p.store(f)<\/pre>\n<h3>Properties in XML<\/h3>\n<p>One can write an XML version of a Java properties file within Java by simply calling the <a href=\"http:\/\/docs.oracle.com\/javase\/6\/docs\/api\/java\/util\/Properties.html#storeToXML%28java.io.OutputStream,%20java.lang.String%29\"><tt>storeToXML()<\/tt><\/a> method on a <tt>Properties<\/tt> object.<\/p>\n<h3>Oozie&#8217;s XML outputs<\/h3>\n<p>I use a lot of Hadoop programs which store their outputs in various XML forms, but one which always drives me nuts is <a href=\"http:\/\/incubator.apache.org\/oozie\/\">Apache Oozie<\/a>. Oozie will dump out a workflow job configuration in XML, but not as a standard Java XML properties file. Oozie takes in the workflow properties as a plain (non-XML) Java properties file, but it will not accept the XML it produces. However, via the joys of <a href=\"http:\/\/www.w3.org\/TR\/xslt\">XSL Transformations (XSLT)<\/a>, we can write a simple stylesheet which converts the XML Oozie produces into the properties file it accepts!<\/p>\n<p><strong>An example (Oozie) Properties file in XML:<\/strong><\/p>\n<pre>&lt;configuration&gt;\r\n  &lt;property&gt;\r\n    &lt;name&gt;date&lt;\/name&gt;\r\n    &lt;value&gt;2011-12-01T00:00Z&lt;\/value&gt;\r\n  &lt;\/property&gt;\r\n  &lt;property&gt;\r\n    &lt;name&gt;endTime&lt;\/name&gt;\r\n    &lt;value&gt;2011-12-01T23:59Z&lt;\/value&gt;\r\n  &lt;\/property&gt;\r\n  &lt;property&gt;\r\n    &lt;name&gt;frequency&lt;\/name&gt;\r\n    &lt;value&gt;1440&lt;\/value&gt;\r\n  &lt;\/property&gt;\r\n  &lt;property&gt;\r\n    &lt;name&gt;group.name&lt;\/name&gt;\r\n    &lt;value&gt;users&lt;\/value&gt;\r\n  &lt;\/property&gt;\r\n  &lt;property&gt;\r\n    &lt;name&gt;jobTracker&lt;\/name&gt;\r\n    &lt;value&gt;jobtracker.example.com:9001&lt;\/value&gt;\r\n  &lt;\/property&gt;\r\n  &lt;property&gt;\r\n    &lt;name&gt;nameNode&lt;\/name&gt;\r\n    &lt;value&gt;hdfs:\/\/namenode.example.com:9000&lt;\/value&gt;\r\n  &lt;\/property&gt;\r\n  &lt;property&gt;\r\n    
&lt;name&gt;oozie.coord.application.path&lt;\/name&gt;\r\n    &lt;value&gt;\/export\/my_workflow\/coordinator.xml&lt;\/value&gt;\r\n  &lt;\/property&gt;\r\n  &lt;property&gt;\r\n    &lt;name&gt;oozie.wf.application.path&lt;\/name&gt;\r\n    &lt;value&gt;hdfs:\/\/namenode.example.com:9000\/user\/john_doe\/my_workflow\/workflow.xml&lt;\/value&gt;\r\n  &lt;\/property&gt;\r\n  &lt;property&gt;\r\n    &lt;name&gt;queueName&lt;\/name&gt;\r\n    &lt;value&gt;default&lt;\/value&gt;\r\n  &lt;\/property&gt;\r\n  &lt;property&gt;\r\n    &lt;name&gt;startTime&lt;\/name&gt;\r\n    &lt;value&gt;2011-12-01T00:00Z&lt;\/value&gt;\r\n  &lt;\/property&gt;\r\n  &lt;property&gt;\r\n    &lt;name&gt;user.name&lt;\/name&gt;\r\n    &lt;value&gt;john_doe&lt;\/value&gt;\r\n  &lt;\/property&gt;\r\n&lt;\/configuration&gt;<\/pre>\n<p><strong>General XSLT transformation from XML to Java properties file<\/strong><\/p>\n<pre>&lt;?xml version=\"1.0\" encoding=\"ISO-8859-1\"?&gt;\r\n&lt;xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http:\/\/www.w3.org\/1999\/XSL\/Transform\"&gt;\r\n&lt;xsl:output method=\"text\" version=\"1.0\" omit-xml-declaration=\"yes\"\/&gt;\r\n  &lt;xsl:template match=\"\/*\"&gt;\r\n    &lt;xsl:for-each select=\"property\"&gt;\r\n      &lt;xsl:value-of select=\"name\"\/&gt;&lt;xsl:text&gt;=&lt;\/xsl:text&gt;&lt;xsl:value-of select=\"value\"\/&gt;&lt;xsl:text&gt;&amp;#xa;&lt;\/xsl:text&gt;\r\n    &lt;\/xsl:for-each&gt;\r\n  &lt;\/xsl:template&gt;\r\n&lt;\/xsl:stylesheet&gt;<\/pre>\n<p><strong>Resulting Java properties file<\/strong><\/p>\n<pre>date=2011-12-01T00:00Z\r\nendTime=2011-12-01T23:59Z\r\nfrequency=1440\r\ngroup.name=users\r\njobTracker=jobtracker.example.com:9001\r\nnameNode=hdfs:\/\/namenode.example.com:9000\r\noozie.coord.application.path=\/export\/my_workflow\/coordinator.xml\r\noozie.wf.application.path=hdfs:\/\/namenode.example.com:9000\/user\/john_doe\/my_workflow\/workflow.xml\r\nqueueName=default\r\nstartTime=2011-12-01T00:00Z\r\nuser.name=john_doe<\/pre>\n<p>For 
those who would rather not write a program, on Linux one can use the simple <a href=\"http:\/\/www.xmlsoft.org\/\">libxml<\/a> tool <a href=\"http:\/\/xmlsoft.org\/xslt\/xsltproc2.html\"><tt>xsltproc(1)<\/tt><\/a> to run this conversion. For example, to take in <tt>my_config<\/tt> in the XML format above and produce the same configuration in Java properties format, one would run: <tt>xsltproc to_property.xslt my_config.xml &gt; my_config.properties<\/tt><\/p>\n<h2>JSON<\/h2>\n<p>JSON, like XML, provides a rich language for expressing structured data. It is often used for data interchange, notably in AJAX web requests. However, JSON,<\/p>\n<h3>Using with Python<\/h3>\n<p>Python has a feature-rich JSON module which maps JSON arrays and objects, with all their members and pairs, onto native Python <tt>list()<\/tt> and <tt>dict()<\/tt> objects respectively. Further, the JSON module provides rich encoding and decoding functionality, as evidenced in the module&#8217;s <a href=\"http:\/\/docs.python.org\/library\/json.html\">PyDoc<\/a>, particularly when <a href=\"http:\/\/docs.python.org\/library\/json.html#encoders-and-decoders\">using<\/a> hooks for encoding and decoding.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Many systems have requirements to store configuration parameters. In these systems, a number of choices can be made for how to store that data; sometimes this diversity is painful, however. 
Choices for storing configuration data are often: Firefox uses and Apple often chooses to use sqlite3 databases Python programs often use ConfigParser to process initialization (.ini) [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[14,17],"tags":[],"_links":{"self":[{"href":"https:\/\/clayb.net\/blog\/wp-json\/wp\/v2\/posts\/64"}],"collection":[{"href":"https:\/\/clayb.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/clayb.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/clayb.net\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/clayb.net\/blog\/wp-json\/wp\/v2\/comments?post=64"}],"version-history":[{"count":0,"href":"https:\/\/clayb.net\/blog\/wp-json\/wp\/v2\/posts\/64\/revisions"}],"wp:attachment":[{"href":"https:\/\/clayb.net\/blog\/wp-json\/wp\/v2\/media?parent=64"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/clayb.net\/blog\/wp-json\/wp\/v2\/categories?post=64"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/clayb.net\/blog\/wp-json\/wp\/v2\/tags?post=64"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}