This site has been quiet the last five weeks, but many good things happened in the background. One of those good things has been progress on a small Netwar System demonstrator virtual machine, tentatively named the Community Edition.
What can you do with Netwar System CE? It supports using one or two Twitter accounts to record content on an ongoing basis, making the captured information available via Kibana, the graphical front end to Elasticsearch. Once the accounts are authorized, the system checks them every two minutes for any list whose name begins with “nsce-”, and the accounts on those lists are recorded.
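The list-selection rule described above can be sketched in a few lines. This is only an illustration of the prefix check, not the system's actual code: the function name `select_recording_lists`, the example list names, and the assumption that the match is case-sensitive are all hypothetical.

```python
# Prefix named in the post; everything else here is illustrative.
PREFIX = "nsce-"

def select_recording_lists(list_names, prefix=PREFIX):
    """Return only the list names that begin with the prefix.

    Assumption: the comparison is case-sensitive. The post does not say
    how Netwar System CE treats upper-case list names.
    """
    return [name for name in list_names if name.startswith(prefix)]

# Hypothetical list names owned by an authorized account.
example_lists = ["nsce-brexit", "friends", "nsce-journalists", "NSCE-upper"]

print(select_recording_lists(example_lists))
# → ['nsce-brexit', 'nsce-journalists']
```

In a running system this check would be repeated on the two-minute polling interval mentioned above, so a newly created list is picked up without restarting anything.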
Each account used for recording produces a tw<name> index containing tweets and a tu<name> index containing the profiles of the accounts.
tw* and tu* are index patterns that cover the respective content from all three accounts. The root account is the system manager, and we assume users might place a set of API tokens on that account for command-line testing.
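To make the naming scheme concrete, here is a small sketch of how per-account index names line up with the tw* and tu* patterns. The account names are hypothetical, and Python's `fnmatchcase` is used only as a stand-in for wildcard matching; it is not how Kibana or Elasticsearch resolve index patterns internally.

```python
from fnmatch import fnmatchcase

def index_names(account):
    """Build the two index names described in the post for one account."""
    return {"tweets": f"tw{account}", "profiles": f"tu{account}"}

def matching(pattern, indices):
    """Return the indices that a simple wildcard pattern would cover."""
    return [name for name in indices if fnmatchcase(name, pattern)]

# Hypothetical recording accounts.
indices = []
for account in ["alice", "bob"]:
    indices.extend(index_names(account).values())

print(matching("tw*", indices))  # → ['twalice', 'twbob']
print(matching("tu*", indices))  # → ['tualice', 'tubob']
```

The point of the patterns is that a single Kibana view can span every recording account without naming each index individually.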
This is a view from Kibana’s Discover tab. The timeframe can be controlled via the time picker at the upper right, the Search box permits filtering, the activity-per-date histogram appears at the top, and in this case we can see a handful of Brexit-related tweets.
There are a variety of visualization tools within Kibana. Here we see a cloud of hashtags used by the collected accounts. The time picker can be adjusted to a certain time frame, search terms may be added so that the cloud reflects only hashtags used in conjunction with the search term, and there are many further refinements that can be made.
What does it take to run Netwar System CE? The following is a minimal configuration of a desktop or laptop that could host it:
- 8 gig of RAM
- solid state disk
- four core processor
There are entry-level Dell laptops on Amazon with these specifications in the $500 range.
The VM itself is very lightweight: two cores, four gig of RAM, and the OVA file for the VM is just over four gig to download.
As shipped, the system has the following limits:
- Tracking via two accounts
- Disk space for about a million tweets
- Collects thirty Twitter accounts per hour per recording account
If you are comfortable with the Linux command line it is fairly straightforward to add additional accounts. If you have some minimal Linux administration capabilities you could add a virtual disk, relocate the Elasticsearch data, and have room for more tweets.
If you are seeking to do a larger project, you should not just multiply these numbers to determine overall capacity. An eight gig VM running our adaptive code can cover about three hundred accounts per hour, and a sixty-four gig server can exceed four thousand.
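A quick bit of arithmetic shows why simple multiplication undersells the larger configurations. The sketch below scales the shipped figures (four gig VM, two recording accounts, thirty accounts per hour each) linearly by RAM and sets the result beside the numbers quoted above. The linear-by-RAM model is purely an assumption for comparison, and the sixty-four gig figure is treated as a lower bound since the post says it can exceed four thousand.

```python
# Shipped Community Edition figures from the post.
SHIPPED_RAM_GB = 4
SHIPPED_ACCOUNTS = 2
SHIPPED_RATE_PER_ACCOUNT = 30  # Twitter accounts collected per hour

shipped_total = SHIPPED_ACCOUNTS * SHIPPED_RATE_PER_ACCOUNT  # 60 per hour

# Figures quoted in the post for the adaptive code (64 gig is "exceeds").
STATED_PER_HOUR = {8: 300, 64: 4000}

def linear_estimate(ram_gb):
    """Naive guess: scale the shipped hourly total in proportion to RAM."""
    return shipped_total * ram_gb // SHIPPED_RAM_GB

for ram_gb, stated in STATED_PER_HOUR.items():
    print(f"{ram_gb} gig: linear guess {linear_estimate(ram_gb)}, stated {stated}")
# → 8 gig: linear guess 120, stated 300
# → 64 gig: linear guess 960, stated 4000
```

Under this assumed model the naive guess comes in at well under half of the quoted figure in both cases, which is why the shipped limits are not a good basis for sizing a larger deployment.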