Elasticsearch
Elasticsearch is a highly scalable open-source full-text search and analytics engine. It allows you to store, search, and analyze large volumes of data quickly and in near real time.
A cluster is a collection of one or more nodes (servers) that together hold your entire data and provide federated indexing and search capabilities across all nodes.
A node is a single server that is part of your cluster, stores your data, and participates in the cluster’s indexing and search capabilities.
An index is a collection of documents that have somewhat similar characteristics.
Within an index, you can define one or more types. A type is a logical category/partition of your index whose semantics are completely up to you.
A document is a basic unit of information that can be indexed. It is expressed in JSON (JavaScript Object Notation), which is a ubiquitous internet data interchange format.
Elasticsearch provides the ability to subdivide your index into multiple pieces called shards. When you create an index, you can simply define the number of shards that you want. Each shard is in itself a fully-functional and independent "index" that can be hosted on any node in the cluster.
To guard against the failure of a node or shard, Elasticsearch allows you to make one or more copies of your index's shards into what are called replica shards, or replicas for short.
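The shard and replica counts described above can be set when an index is created. As a minimal sketch, the helper below builds the request body for that; the function name `index_settings_body` and the example values are illustrative assumptions, while `number_of_shards` and `number_of_replicas` are the Elasticsearch index settings.

```shell
# Hypothetical helper: print a create-index request body for the given
# number of primary shards and number of replicas per primary.
index_settings_body() {
  shards=$1
  replicas=$2
  printf '{"settings":{"number_of_shards":%d,"number_of_replicas":%d}}\n' "$shards" "$replicas"
}

# Example use (needs a running node, so it is left commented out):
# curl -XPUT 'localhost:9200/agent?pretty' -d "$(index_settings_body 3 1)"
```

Elasticsearch 6.0 and later also expect a `Content-Type: application/json` header on requests that carry a body.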
- Install the package for your distribution
- Start a single-node cluster:
# elasticsearch
......................
[2015-04-03 14:33:37,572][INFO ][node] [The Entity] started
We can see that our node named "The Entity" (which is a random Marvel character) has started.
Tip: to override either the cluster or node name, run the command this way:
# elasticsearch --cluster.name my_cluster_name --node.name my_node_name
Note:
- It works with java-7-openjdk and java-8-openjdk.
- Elasticsearch listens on port 9200 by default.
The REST API exposed on port 9200 lets you:
- Check your cluster, node, and index health, status, and statistics
- Administer your cluster, node, and index data and metadata
- Perform CRUD (Create, Read, Update, and Delete) and search operations against your indexes
- Execute advanced search operations such as paging, sorting, filtering, scripting, faceting, aggregations, and many others
$ curl 'localhost:9200/_cat/health?v'
epoch      timestamp cluster       status node.total node.data shards pri relo init unassign pending_tasks
1428073032 16:57:12  elasticsearch green           1         1      0   0    0    0        0             0
The cluster "elasticsearch" is up with green status, 1 node, and 0 shards.
$ curl 'localhost:9200/_cat/nodes?v'
host      ip        heap.percent ram.percent load node.role master name
hortensia 127.0.0.1            3          33 0.29 d         *      The Entity
One node, "The Entity", is running.
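The `_cat` outputs shown so far are plain whitespace-separated tables, so a single column can be read by its header name. The following is only a sketch; `cat_column` is a hypothetical helper, not part of Elasticsearch.

```shell
# Hypothetical helper: read the output of a `_cat` API called with `?v`
# on stdin and print the value under the named header for each data row.
# Columns are split on whitespace, so values that contain spaces (such as
# the node name "The Entity") are not handled.
cat_column() {
  awk -v name="$1" '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == name) col = i; next }
    col     { print $col }
  '
}

# Example use (needs a running node, so it is left commented out):
# curl -s 'localhost:9200/_cat/health?v' | cat_column status
```

Looking the column up by name rather than by position keeps the helper working if the set of columns differs between versions.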
$ curl 'localhost:9200/_cat/indices?v'
The command above lists all indexes.
Here we create the agent index:
$ curl -XPUT 'localhost:9200/agent?pretty'
{
  "acknowledged" : true
}
$ curl 'localhost:9200/_cat/indices?v'
health status index pri rep docs.count docs.deleted store.size pri.store.size
yellow open   agent   5   1          0            0       575b           575b
Our new agent index has been created and has yellow status. The reason is that Elasticsearch by default creates one replica for this index. Since only one node is running at the moment, that replica cannot be allocated (a replica is kept on a different node from its primary, for high availability) until another node joins the cluster. Once the replica is allocated to a second node, the health status of this index turns green.
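The reasoning above can be written down as a small rule of thumb. This is a simplified sketch: `expected_index_health` is a hypothetical helper, it assumes all primary shards are already allocated, and it ignores other allocation constraints such as disk space.

```shell
# Hypothetical helper: given the number of data nodes and the number of
# replicas per primary shard, print the index health one would expect.
# Assumes every primary shard is already allocated (so never "red").
expected_index_health() {
  nodes=$1
  replicas=$2
  # Copies of the same shard are never placed on the same node, so
  # allocating every copy needs at least replicas + 1 data nodes.
  if [ "$nodes" -ge $((replicas + 1)) ]; then
    echo green
  else
    echo yellow
  fi
}

# On a single-node setup, dropping the replica count to 0 is one way to
# reach green (needs a running node, so it is left commented out):
# curl -XPUT 'localhost:9200/agent/_settings' -d '{"index":{"number_of_replicas":0}}'
```

With one node and one replica, as in the listing above, the helper prints `yellow`.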
Let's index a JSON document, { "name": "agreenmamba" }, into the agent index under the external type with an ID of 1:
$ curl -XPUT 'localhost:9200/agent/external/1?pretty' -d '
{
  "name": "agreenmamba"
}'
Response:
{
  "_index" : "agent",
  "_type" : "external",
  "_id" : "1",
  "_version" : 1,
  "created" : true
}
Retrieve the document we just indexed:
$ curl -XGET 'localhost:9200/agent/external/1?pretty'
{
  "_index" : "agent",
  "_type" : "external",
  "_id" : "1",
  "_version" : 1,
  "found" : true,
  "_source":{
    "name": "agreenmamba"
  }
}
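Both the indexing and the retrieval request address a document with the path pattern `/<index>/<type>/<id>`. As a small sketch of that pattern, the helper below assembles such a URL; `doc_url` and the `ES_HOST` variable are hypothetical names introduced here for illustration.

```shell
# Hypothetical helper: build the REST URL of one document from its
# index, type, and ID. The host defaults to localhost:9200 and can be
# overridden through the (assumed) ES_HOST environment variable.
doc_url() {
  printf '%s/%s/%s/%s\n' "${ES_HOST:-localhost:9200}" "$1" "$2" "$3"
}

# Example use (needs a running node, so it is left commented out):
# curl -XGET "$(doc_url agent external 1)?pretty"
```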
Finally, delete the agent index:
$ curl -XDELETE 'localhost:9200/agent?pretty'