I’m going to try out CoreOS to create a cluster of machines hosting an app composed of Docker containers.
CoreOS is a Linux distribution designed to run clusters of containers efficiently and securely. Our application components run in Docker containers, organized as services. The distribution also includes etcd, a key/value store for distributed configuration management and service discovery, and fleet, which manages services across the cluster. The machines can be updated automatically with security fixes and patches.
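As a taste of how fleet schedules services, a unit file along these lines asks fleet to run a container somewhere in the cluster (the service and image names here are hypothetical, a sketch rather than our actual unit):

```ini
# chat.service — hypothetical fleet unit; "example/chat" is a made-up image name.
[Unit]
Description=Chat application container
After=docker.service
Requires=docker.service

[Service]
TimeoutStartSec=0
ExecStartPre=-/usr/bin/docker kill chat
ExecStartPre=-/usr/bin/docker rm chat
ExecStart=/usr/bin/docker run --name chat -p 8080:8080 example/chat
ExecStop=/usr/bin/docker stop chat
```

Submitting it with `fleetctl start chat.service` hands scheduling to the cluster rather than a single machine.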
First, let’s create the CloudFormation template we’ll use to create our stack. I started from a minimal CoreOS template I found here, adding parameters for VPC subnets and their related availability zones. I also updated the AMI for us-east-1, where I’m deploying this cluster, to the most recent stable CoreOS release.
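The parameters I added look roughly like this (the parameter names are my own choices, not from the minimal template; the AMI mapping lives elsewhere in the template):

```json
"Parameters": {
  "VpcId": {
    "Type": "String",
    "Description": "VPC to launch the cluster into"
  },
  "Subnets": {
    "Type": "CommaDelimitedList",
    "Description": "Subnet IDs for the cluster instances"
  },
  "AvailabilityZones": {
    "Type": "CommaDelimitedList",
    "Description": "Availability zones matching the subnets, in order"
  }
}
```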
We’ve decided to use Docker containers to package and deploy application components as nano services. This post documents our initial experimentation building and running a sample application as a service.
The application is the Spring Integration STOMP-over-WebSocket chat sample app. It is a Java application running in an embedded Tomcat container, packaged as a standalone (uber) jar.
This type of application seems to be a great fit for a Docker container:
self-contained; includes dependencies
provides a service
consists of one process
can be used by other services (or people, but we’ll pretend it’s a messaging service for our applications)
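A first cut at a Dockerfile for such an uber jar might look like this (the base image, jar name, and port are placeholders, not taken from the sample project):

```dockerfile
# Minimal sketch; the jar name and exposed port are assumptions.
FROM java:8-jre
COPY target/chat-sample.jar /app/chat-sample.jar
EXPOSE 8080
CMD ["java", "-jar", "/app/chat-sample.jar"]
```

Since the jar is self-contained, the image needs little beyond a JRE and the jar itself.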
We now have the Apollo broker running on our virtual host and a URI on our web server to hit to generate messages. We want to display those messages in near real-time in a web browser on the host machine.
Our setup
the web server and PHP backend app are running on our virtual machine (IP=10.0.0.10).
the Apache Apollo broker is running on the virtual machine.
our webapp server sends a test message to tcp://localhost:61613/topic/okcra-api-ops when the URI /sendStompMessage is hit.
we added a firewall rule on the virtual machine to allow access to port tcp/61623, where Apollo listens for WebSocket connections.
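For reference, the STOMP frame our backend produces looks something like this on the wire (the body text is made up; ^@ stands for the NUL byte that terminates every STOMP frame):

```
SEND
destination:/topic/okcra-api-ops
content-type:text/plain

hello from /sendStompMessage^@
```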
web client
Apollo ships with example code, including a WebSocket example. The HTML page is located at /examples/stomp/websocket/index.html.
We will load that page into our browser, fill in our connection details, and connect to the Apollo broker on our virtual host. Once connected, it waits for and displays any new messages appearing in the topic or queue we have registered interest in.
puPHPet is a nifty tool for configuring a virtual PHP development environment. It uses Vagrant to manage virtual machines and Puppet to configure them. It’s a great start, but we need some additional configuration. This is the story of extending puPHPet to our needs.
puppet hiera
Recent puPHPet uses Puppet’s hiera facility to provide configuration information to Puppet at runtime. I would like to use hiera for our additional configuration, but puPHPet only seems to use hiera sources for a small subset of parameters. The puPHPet help mentions a common.yaml hiera file, but the code uses config.yaml instead. That’s confusing, but ../puphpet/puppet/hiera.yaml spells it out for us.
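The relevant part of that hiera.yaml is something along these lines, pointing the yaml backend at config.yaml rather than common.yaml (treat this as a sketch, not the verbatim file; the datadir path is an assumption):

```yaml
---
:backends:
  - yaml
:yaml:
  :datadir: /vagrant/puphpet
:hierarchy:
  - "config"
```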
Getting started with Apache Apollo, follow-up project to ActiveMQ
Apache Apollo is an all-dancing, all-singing message broker, queue manager, integration engine, etc., written in Scala and running on a JVM. It is the follow-on to ActiveMQ.
distributed, highly available, redundant, and horizontally scalable architecture
document store using API language clients or HTTP REST interface
indexes are like databases in an RDBMS; types are like tables
Every field in a document is indexed and can be queried.
CRUD operations are easy
optimistic concurrency control on update/delete ops using version parameter
automatic versioning
update operation merges desired changes
Groovy scripting by default, available in the request body
updates can be retried on version conflict with the retry_on_conflict param
bulk operations
each document in an index has a type. Every type has its own mapping or schema definition.
simple value searches, ranges, etc
indexed, analyzed (tokenization, normalization), analyze API
_all is a system-generated full-text field which can be disabled
full-text search
relevance scores in full-text search results
phrase searches
highlighting results
sorting results, by relevance by default or as specified with the sort param
filter DSL (term, terms, range, exists, missing, bool, …)
query DSL (match_all, match, multi_match, bool, …)
analytics
aggregations, nested
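Pulling a few of these together, a search body combining a full-text query, a filter, an aggregation, and a sort might look like this (the field names and values are hypothetical; the filtered query wraps a query and a filter):

```json
{
  "query": {
    "filtered": {
      "query":  { "match": { "message": "timeout" } },
      "filter": { "range": { "status": { "gte": 400 } } }
    }
  },
  "aggs": {
    "by_status": { "terms": { "field": "status" } }
  },
  "sort": [ { "@timestamp": { "order": "desc" } } ]
}
```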
This post will begin our look at querying Elasticsearch directly, via its search API. We’ve looked at reporting and graphing tools like Kibana, which provide some veneer over the actual queries. Now we’ll see what the queries and responses look like under the covers.
The first query we’ll make will search an entire index with no filter provided - we will just dump the data content.
The API is accessible via an HTTP or HTTPS URI using the POST method. There are many search flavors available, documented in detail in the Elasticsearch search API reference; we’ll just touch the surface here.
The search API accepts either a query parameter or a request body. The query parameter form is limited but good for quick testing, so we’ll use it first.
The simplest search query ever …
The URI structure to invoke the simplest elasticsearch query API looks like this: http(s)://logsene-receiver.sematext.com/OUR-LOGSENE-APP-TOKEN/_search
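As a sketch of the query-parameter form, here is how that URI could be built with a Lucene query string in the q parameter (the token placeholder is from the URI above; status:404 is a made-up example query):

```python
from urllib.parse import urlencode

# Build the simplest search URI: a Lucene query string passed as the q parameter.
# OUR-LOGSENE-APP-TOKEN is the placeholder from the URI above; status:404 is a made-up query.
base = "https://logsene-receiver.sematext.com/OUR-LOGSENE-APP-TOKEN/_search"
params = {"q": "status:404", "size": 5}
url = base + "?" + urlencode(params)
print(url)
```

Hitting that URL (with a real token) returns matching documents as JSON.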
Kibana is the name of a visualization tool for elasticsearch; it runs in your web browser. Kibana enables you to query and view records from your elasticsearch repository. It’s easy to host Kibana yourself or, as we are doing here, use a hosted version.
The data and query interface are the same as we saw in the reporting entry; we’ll see more detail when we get to Elasticsearch.
Now that our log records are stored and queryable by the logging service we’d like to make use of our data. We can generate reports about whatever interests us, be it usage patterns, errors, etc.
The log records can be parsed into discrete fields which can be indexed and subsequently searched and filtered using some exposed form of query. This is extremely valuable with distributed log sets.
For example I can ask to see all of yesterday’s apache access log entries for a particular application where the HTTP return code was >= 400, whether the application was running on one, a dozen, or a hundred servers.
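That example would translate into a filtered query something like the following (the field and application names are hypothetical; now-1d/d and now/d bound “yesterday” using date rounding):

```json
{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            { "term":  { "app": "my-app" } },
            { "range": { "status": { "gte": 400 } } },
            { "range": { "@timestamp": { "gte": "now-1d/d", "lt": "now/d" } } }
          ]
        }
      }
    }
  }
}
```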
This entry will explore Logsene’s reporting interface. We’ll report on Logsene’s more direct Elasticsearch interfaces in later entries.