Splunk FAQ
Q: What the heck is a splunk?
A: Splunk is basically a search engine for your log data. It supports many log sources, such as Apache access logs, MySQL database logs, and any log in standard syslog format.
Q: Ok, that sounds neat. Where can I get it and how bad is this going to hurt my pocketbook?
A: Actually, the basic version is free. It's not crippleware either, although it does nag you if you index over 500 MB a day. You can get the free download here. The Pro version could do some damage to your pocketbook, but hey, that's what budgets are for, right?
Q: Why would I want Pro if I can get the basic version for free and it's not crippled?
A: The main reason would be volume. Splunk Server (the basic version) is limited to 500 MB a day, and Splunk Pro pricing depends on how much you want to log. The other reason is that Pro has a superset of features. The main ones are 'live splunks' (basically a cron job that runs a search and sends out alerts if it returns any results), multiple index support, and full user support with configurable permissions.
Q: When I try to connect to Splunk via my web browser it gives me a message like "could not connect to splunkd". I have no idea what's wrong since it seemed to start just fine. Any idea what's wrong?
A: A lot of things could be wrong since that's a fairly general error message. Your best bet is to put Splunk into Debug mode and check the log at $SPLUNK_HOME/var/log/splunk/splunkd.log
Q: How do I put Splunk into Debug mode?
A: I knew you were going to ask that question. ;-) Splunk uses something called Log4CPP for its logging. It's kinda like Log4J if you're familiar with that. The config file is located at $SPLUNK_HOME/etc/log.cfg. The only line that you should change is the first one that's not a comment. It looks like this:
And you should change it to look like this:
After changing that, you'll need to restart Splunk for it to take effect. The debug logging is VERY verbose, so you might not want to run it that way for very long. I would recommend that you reproduce the error you're interested in and save a copy of the log file somewhere else to look at before you change the setting back. Every time you restart Splunk the splunkd.log file is cleared, so make sure to save a copy of it first. If you can't figure out the problem on your own, gzip the log and send it to support@splunk.com; the nice folks in support will help you out.
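If you'd rather not copy and gzip the log by hand every time, here's a minimal Python sketch of the "save a copy before you restart" step. The function name, the destination directory, and the timestamped file name are my own inventions for illustration; the only things taken from above are the log location under $SPLUNK_HOME and the /opt/splunk default.

```python
import gzip
import os
import shutil
import time
from pathlib import Path


def archive_splunkd_log(dest_dir, splunk_home=None):
    """Copy splunkd.log into dest_dir as a timestamped .gz file and return its path."""
    # Assumption: fall back to /opt/splunk when SPLUNK_HOME is not set.
    home = Path(splunk_home or os.environ.get("SPLUNK_HOME", "/opt/splunk"))
    source = home / "var" / "log" / "splunk" / "splunkd.log"

    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = dest_dir / f"splunkd-{stamp}.log.gz"

    # Stream the copy so a very large debug log is never held in memory.
    with open(source, "rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target


# Example (hypothetical destination directory):
# archive_splunkd_log(Path.home() / "splunk-debug-logs")
```

Run it after you've reproduced the error and before you restart Splunk, since the restart is what clears the log.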
Q: What's this $SPLUNK_HOME thing you keep talking about? That environment variable doesn't exist on my system.
A: $SPLUNK_HOME represents where you chose to install Splunk. By default, Splunk will install in /opt/splunk if you don't tell it another location. I read somewhere in their documentation that it's recommended that you set it as an environment variable on your system to make dealing with Splunk a bit easier. I'd also recommend putting $SPLUNK_HOME/bin in your PATH to make the commands there easier to access.
Q: I'm using the tailing processor and I'm having problems. The box is running at 99% CPU utilization and Splunk isn't quite keeping up with my log data.
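If you drive Splunk from scripts, the same idea can be sketched in Python: build an environment that has SPLUNK_HOME set and its bin directory on the PATH. This is only an illustration under my own assumptions; `splunk_env` is a made-up helper name, and /opt/splunk is used only because it's the default install location mentioned above.

```python
import os


def splunk_env(base_env=None, default_home="/opt/splunk"):
    """Return a copy of the environment with SPLUNK_HOME set and its bin directory on PATH."""
    env = dict(os.environ if base_env is None else base_env)
    # Keep an existing SPLUNK_HOME; otherwise assume the default install location.
    home = env.setdefault("SPLUNK_HOME", default_home)
    bin_dir = os.path.join(home, "bin")
    parts = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    if bin_dir not in parts:
        # Put Splunk's bin directory first so its commands are found.
        parts.insert(0, bin_dir)
    env["PATH"] = os.pathsep.join(parts)
    return env
```

The returned dictionary can be handed to `subprocess.run(..., env=splunk_env())` so that child processes see both settings without you touching your login profile.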
A: Yeah... They're working on that, trust me :-) The good news is that there's a very good workaround. Use the fifoInput plugin. The doc on how to set it up on the Splunk side of things is here. I've also written a howto on setting up syslog-ng to write to a FIFO, and you can find it right here. I would highly recommend syslog-ng to anybody who is serious about setting up a logging architecture as it is more flexible than any other log server on the market [and it's free software, we like free software :-)].
Q: My Splunk server is completely hosed. When I connect to its web interface it's out to lunch or something since I just get that searching message and the three little dots. Hours later it displays some funny message about waiting for management approval. What gives??
A: Yeah, I get that a lot too. Run top and sort by CPU usage [press P]. If you have one splunkd process running at 99%, then I'm sorry friend, but you are indeed hosed. Try a `splunk stop`, wait a few minutes, and if the splunkd processes are still running, do a `killall -9 splunkd`, or the equivalent for your platform. After the process massacre is over, you'll need to clean out your database completely. Unfortunately, as of version 1.2, there is no easy way to do that, unless you want to run a bunch of rm commands by hand. So I made some modifications to the 'splunk' script in /opt/splunk/bin to add back in the old functionality. I also made a few other tweaks to the script to fix the annoying tail error message and make it a bit more cross-platform safe. You can download a gzip archive of it here.
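In case the FIFO idea is new to you, here's a small Python sketch of what the writing side does: create a named pipe and push newline-terminated log lines into it. It says nothing about how fifoInput or syslog-ng are configured (see the docs linked above for that); the function names are mine, and any path you pass in is up to you. POSIX only.

```python
import os
import stat


def open_fifo_for_writing(path):
    """Create the named pipe at `path` if needed and open it for line-buffered writing."""
    if not os.path.exists(path):
        os.mkfifo(path, 0o600)
    elif not stat.S_ISFIFO(os.stat(path).st_mode):
        raise ValueError(f"{path} exists but is not a FIFO")
    # Opening a FIFO for writing blocks until some reader has it open.
    return open(path, "w", buffering=1)


def write_log_line(fifo, line):
    """Write exactly one newline-terminated log line to the FIFO."""
    fifo.write(line.rstrip("\n") + "\n")
```

The thing to remember is that the open call blocks until something is reading the other end, so start the reader first or expect the writer to sit and wait.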
UPDATE: On this particular issue, updating to version 1.2.4 fixed the database lockup problem for me! I will keep this info here though in case future problems yield the same results. It's also useful for anyone wishing to zero out their database.
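For the curious, the stop, wait, then kill routine described above can be sketched in Python like this. It's an illustration under a few assumptions of mine: that `splunk` is on your PATH, that `pgrep` and `killall` exist on your platform, and that three minutes counts as "a few minutes". The function names and the timing parameters are made up. It does not clean out the database.

```python
import subprocess
import time


def splunkd_running():
    """Return True if a process named exactly splunkd exists (assumes pgrep is available)."""
    result = subprocess.run(["pgrep", "-x", "splunkd"], stdout=subprocess.DEVNULL)
    return result.returncode == 0


def stop_splunk(run=subprocess.run, is_running=splunkd_running,
                sleep=time.sleep, wait_seconds=180, poll_seconds=5):
    """Ask Splunk to stop; force-kill splunkd only if it outlives the wait.

    Returns True if the force-kill was needed, False otherwise.
    """
    run(["splunk", "stop"])  # assumes $SPLUNK_HOME/bin is on PATH
    waited = 0
    while is_running() and waited < wait_seconds:
        sleep(poll_seconds)
        waited += poll_seconds
    if is_running():
        # Last resort, same as running `killall -9 splunkd` by hand.
        run(["killall", "-9", "splunkd"])
        return True
    return False
```

The command runner and the process check are passed in as parameters so you can dry-run the logic with stand-ins before letting it loose on a real box.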
Q: I have a Windows machine that I need logs from. How do I set that up??
A: You'll be amazed at how easy this is. First off, you'll need some type of syslog server. If you're running Splunk Professional, it has a syslog server built in, which makes it even easier. If you're installing a separate syslog server to collect logs, I'd recommend syslog-ng. There are several excellent syslog agents available for Windows, and your choice should be based on your needs. For most applications, Snare will do nicely. It's free [remember, we like free!], open source software, with a simple installer and straightforward GUI configuration. There is also great documentation available on their website. Snare is designed to send the Windows Event Log to a remote syslog server, and only that. If you need to send IIS logs or ISA logs, Snare has agents that can do that too. If your needs are more complex, for instance if you need to pick up an arbitrary log file and send it, or you need advanced filtering on the front end, I'd recommend Adiscon Monitorware Agent. It's the Swiss Army knife of Windows log agents and is available for about $100 a copy.
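Before you install an agent, it can help to confirm that your syslog server actually hears UDP traffic from the machine in question. Here's a minimal Python sketch that sends one BSD-style syslog datagram. It is not how Snare or Monitorware work internally; the function name, the tag, and the default message are my own, and port 514 is just the conventional syslog UDP port, so adjust it if your server listens elsewhere.

```python
import socket


def send_test_syslog(host, port=514, message="syslog test message",
                     facility=1, severity=6, tag="logtest"):
    """Send one BSD-syslog-style UDP datagram and return the bytes that were sent."""
    # PRI is facility * 8 + severity; facility 1 is user-level, severity 6 is informational.
    pri = facility * 8 + severity
    payload = f"<{pri}>{tag}: {message}".encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
    return payload
```

Call it with your syslog server's address and then search for the tag on the receiving end. UDP gives no delivery confirmation, so a missing message may mean a firewall is in the way rather than a broken server.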
Q: Wow!! Splunk is a really neat tool! Now how do I use it?
A: The best way to get up to speed quickly on what you can do with Splunk is to download a copy of the Splunk Cheat Sheet from Corey Shield's website.




