[nycbug-talk] version control for config files

Sun Mar 8 00:59:56 EST 2009

Puppet is a really good option for config management. I looked
seriously at it a while back and liked what I saw a lot, but I had two
major problems with it. First, it's written in Ruby and I'm not a
Ruby programmer, so that kind of bites if I want to extend it.
Second, there's a pretty serious bootstrap cost to implementing it,
and nearly every configuration management system I've looked at
suffers from this problem. It's a reasonably complex system, it's
non-trivial to understand how to make it do what you want it to do,
and once you do get a good handle on it, converting all your systems
to use it can be a pretty serious undertaking, depending on how many
you have. At the time, I was looking at a couple hundred, with
configurations drifted all over hell and gone, so that was a time sink
I couldn't afford.

What I ended up doing was writing a stupid-simple configuration manager
in Python; we call it ghetto-config in the office. I've actually been
thinking about asking to open source it, and I'll talk to my boss about
it on Monday. The basic concepts are simple, and it didn't take me
more than an afternoon to implement a first cut.

Ghetto-config works as a very simple templating engine. You define a
bunch of key-value pairs, and you can use those keys in a file and
have ghetto-config substitute in the appropriate value for you when it
parses it. It also understands how to set file modes, create
symlinks, manage owners and groups, and diff versions of a file
if it's changed.
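
To make the substitution idea concrete, here is a rough sketch. It is
not the actual ghetto-config code (which isn't shown in this message);
the render() function, the regular expression, and the sample values
are all made up for illustration, and the $key$ token syntax is
borrowed from the $eth0-ip$ example further down.

```python
import re

# Assumed token syntax: $key$, where key is letters, digits, '_', '.' or '-'.
TOKEN = re.compile(r"\$([A-Za-z0-9_.-]+)\$")


def render(template, definitions):
    """Replace each $key$ token with its value; fail loudly on unknown keys."""
    def lookup(match):
        key = match.group(1)
        if key not in definitions:
            raise KeyError("undefined key in template: " + key)
        return definitions[key]
    return TOKEN.sub(lookup, template)


if __name__ == "__main__":
    # Hypothetical definitions and template text.
    definitions = {"eth0-ip": "192.168.1.15", "hostname": "web01"}
    template = "IPADDR=$eth0-ip$\nHOSTNAME=$hostname$\n"
    print(render(template, definitions), end="")
```

Raising on an undefined key is a design choice in this sketch, so that
a typo in a template doesn't silently end up in a config file.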

Basically, you assign each machine a unique ID of some sort at install
time (or later, if you want). We used the MAC address of the interface
that was used to PXE boot the machine for kickstart, but you could use
the hostname or whatever if that was easier. Ghetto-config takes that
ID and uses it to build a URL to fetch configuration information from
a central HTTP server storing config data. It fetches that URL and
gets back a file in config parser syntax (basically an .ini file)
with a few special kinds of sections. The first is includes, so it
can pull in other config parser syntax files. The second is
definitions, which lets you set up key-value pairs, like $eth0-ip$ =
192.168.1.15. The third is a set of managed file sections.
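
As a sketch of what such a file and its parsing could look like,
assuming Python's standard configparser module: the server name, URL
layout, section names other than the includes/definitions idea, and
option names below are all hypothetical, not the real ghetto-config
format.

```python
import configparser

BASE_URL = "http://config.example.com/hosts"  # hypothetical central server


def config_url(mac):
    """Build the per-machine URL from a MAC address (URL layout is assumed)."""
    return "{}/{}.ini".format(BASE_URL, mac.lower().replace(":", "-"))


# Hypothetical per-machine config in .ini syntax.
SAMPLE = """
[includes]
common = http://config.example.com/common/base.ini

[definitions]
$eth0-ip$ = 192.168.1.15

[resolv.conf]
template = http://config.example.com/templates/resolv.conf
path = /etc/resolv.conf
"""


def parse_config(text):
    """Split the file into includes, definitions, and managed-file sections."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(text)
    includes = []
    if parser.has_section("includes"):
        includes = list(parser["includes"].values())
    definitions = {}
    if parser.has_section("definitions"):
        definitions = dict(parser["definitions"])
    # Every other section is treated as a managed file.
    managed = {
        name: dict(parser[name])
        for name in parser.sections()
        if name not in ("includes", "definitions")
    }
    return includes, definitions, managed
```

Fetching the URL and following the includes is left out here; the
point is only how the three kinds of sections separate.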

Each managed file section gives the URL to fetch the template from (so  
you can use the same template file for multiple machines by just  
pointing to the URL of a canonical version) and the location on the  
file system to write the rendered template to once the substitutions  
have been performed.  It optionally includes a file mode, owner,
group, and the location of a symlink to create that points at the file.
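
Applying one managed file section could look roughly like the
following. This is an illustration only: apply_managed_file() and its
parameter names are invented here, the mode is assumed to be given as
an octal string such as "0644", and changing owner or group will
normally need root.

```python
import os
import shutil


def apply_managed_file(rendered, path, mode=None, owner=None, group=None,
                       symlink=None):
    """Write rendered text to path, then apply the optional attributes."""
    with open(path, "w") as handle:
        handle.write(rendered)
    if mode is not None:
        os.chmod(path, int(mode, 8))  # e.g. "0644" -> 0o644
    if owner is not None or group is not None:
        shutil.chown(path, user=owner, group=group)  # usually requires root
    if symlink is not None:
        # Replace a stale link so the call can be repeated safely.
        if os.path.islink(symlink):
            os.remove(symlink)
        os.symlink(path, symlink)
```

A call might look like apply_managed_file(text, "/etc/resolv.conf",
mode="0644"), with owner, group, and symlink left out when the section
doesn't set them.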

There's some additional detail in structuring the central config
server and doing some other stuff that makes it simpler to manage, but
that's the gist of it. It's about two hundred lines of Python. It
supports doing diffs between the central configuration and the local
reality for any attribute it understands (file contents, owner,
group, mode, etc.). It also has a programmatic mode intended to be run
from cron that'll tell you whether anything has changed on the
machine, so you can check nightly that none of your machines have
drifted. We've started checking the central config tree into SVN so we
have an audit trail of who did what to what file.
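
For a feel of the diff and drift-check side, here is a minimal sketch
using the standard difflib module. The function names are hypothetical
and it only covers file contents and mode, not owner or group; it is
not how ghetto-config itself is written.

```python
import difflib
import os
import stat


def content_diff(expected, path):
    """Return unified-diff lines from the file on disk to the expected text."""
    try:
        with open(path) as handle:
            actual = handle.read()
    except FileNotFoundError:
        actual = ""
    return list(difflib.unified_diff(
        actual.splitlines(keepends=True),
        expected.splitlines(keepends=True),
        fromfile=path + " (local)",
        tofile=path + " (central)",
    ))


def has_drifted(expected, path, expected_mode=None):
    """True if the file is missing, or its contents or mode differ."""
    if not os.path.exists(path):
        return True
    if content_diff(expected, path):
        return True
    if expected_mode is not None:
        actual_mode = stat.S_IMODE(os.stat(path).st_mode)
        return actual_mode != int(expected_mode, 8)  # mode as octal string
    return False
```

A cron wrapper could call has_drifted() for each managed file and exit
non-zero if any call returns True, which is enough for cron to mail
you about it.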

Does it do everything Puppet does?  God no.  Does it do everything
cfengine does?  Again, god no.  It's an 80/20 solution. It took me an
afternoon to write the first version of it, and maybe a week of total
development time to get it doing what it does now, with full test
coverage. It happily manages several hundred machines, it makes
installing and provisioning a new machine a thirty-second job instead
of a half hour or so, and it makes managing software installs over
multiple machines much, much easier and more deterministic.

Like I said, I've been planning on asking about open sourcing it  
anyway, but if I don't get to do that, I'll be happy to answer  
questions or give pointers where I can.

--Dave

On Mar 8, 2009, at 12:36 AM, Brian Gupta wrote:

> I know you think it is overkill, but seriously take a look at puppet.
> (I'm biased here). Basically it lets you put all your configs on a
> central server and then use SVN/GIT/whatever to manage it.
>
> I'd be willing to help you get started, and we have a puppet user
> group in NYC. (Although we would be open to expanding it to include
> cfengine, if there are enough cfengine folks around).
>
> -Brian
>
> On Thu, Feb 26, 2009 at 6:27 PM, Charles Sprickman <spork at bway.net> wrote:
>> Howdy,
>>
>> I think the subject pretty much sums it up - I'm sick of not tracking
>> changes in /etc and /usr/local/etc.  I want something that deals with
>> file permissions and is relatively transparent.
>>
>> I've been googling around, but finding not much other than weird
>> contortions based on CVS that make such huge disclaimers as "this of
>> course does not work with symlinks" or "this of course does not
>> maintain file ownership/permissions".
>>
>> cfengine and the like do more than I want...
>>
>> Any interesting ideas out there?
>>
>> The most I have to work with is perhaps a dozen servers, maybe almost
>> that many jails.
>>
>> Thanks,
>>
>> Charles
>>
>> ___
>> Charles Sprickman
>> NetEng/SysAdmin
>> Bway.net - New York's Best Internet - www.bway.net
>> spork at bway.net - 212.655.9344
>>
>> _______________________________________________
>> talk mailing list
>> talk at lists.nycbug.org
>> http://lists.nycbug.org/mailman/listinfo/talk
>>
>
> -- 
> - Brian Gupta
>
> New York City user groups calendar:
> http://nyc.brandorr.com/