gregjeanmart

5 min read - Posted 10 Dec 19

Getting started with Mahuta - A Search engine for the IPFS

Mahuta (formerly known as IPFS-Store) is a library and API to aggregate and consolidate files or documents stored by your application on the IPFS network. It provides a way to collect, store, index and search that data.

Features

  • Indexation: Mahuta stores documents or files on IPFS and indexes the hash with optional metadata.
  • Discovery: Indexed documents and files can be searched using complex logical queries or fuzzy/full-text search.
  • Scalable: Optimised for large-scale applications through asynchronous writes and caching.
  • Replication: A replica set can be configured to replicate (pin) content across multiple nodes (standard IPFS nodes or IPFS-Cluster nodes).
  • Multi-platform: Mahuta can be used as an embedded Java library for your JVM-based application or run as a scalable, configurable REST API.



Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Prerequisites

Mahuta depends on two components:

  • an IPFS node (go or js implementation)
  • a search engine (currently only ElasticSearch is supported)

First, see how to run those two components: run IPFS and ElasticSearch.

Java library

  1. Import the Maven dependencies (core module + indexer)
<dependency>
    <groupId>net.consensys.mahuta</groupId>
    <artifactId>mahuta-core</artifactId>
    <version>${MAHUTA_VERSION}</version>
</dependency>
<dependency>
    <groupId>net.consensys.mahuta</groupId>
    <artifactId>mahuta-indexing-elasticsearch</artifactId>
    <version>${MAHUTA_VERSION}</version>
</dependency>
  2. Configure Mahuta to connect to an IPFS node and an indexer
Mahuta mahuta = new MahutaFactory()
    .configureStorage(IPFSService.connect("localhost", 5001))
    .configureIndexer(ElasticSearchService.connect("localhost", 9300, "cluster-name"))
    .defaultImplementation();
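  3. Execute high-level operations
// ImmutableMap comes from Guava (com.google.common.collect.ImmutableMap).

// Store a Markdown string on IPFS and index it under an ID with metadata
IndexingResponse indexingResponse = mahuta.prepareStringIndexing("article", "## This is my first article")
    .contentType("text/markdown")
    .indexDocId("article-1")
    .indexFields(ImmutableMap.of("title", "First Article", "author", "greg"))
    .execute();

// Fetch the indexed document by its ID, loading the file content as well
GetResponse getResponse = mahuta.prepareGet()
    .indexName("article")
    .indexDocId("article-1")
    .loadFile(true)
    .execute();

// Search the index for documents whose author equals "greg" (page 0, 20 results per page)
SearchResponse searchResponse = mahuta.prepareSearch()
    .indexName("article")
    .query(Query.newQuery().equals("author", "greg"))
    .pageRequest(PageRequest.of(0, 20))
    .execute();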

For more information, see the Mahuta Java API documentation.
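The `PageRequest.of(0, 20)` call in the search example passes a page number and a page size. As a rough illustration of the arithmetic behind this kind of paging, the sketch below assumes a zero-based page index and a ceiling division for the page count. `PageMath` and its methods are hypothetical helpers written for this article, not part of Mahuta.

```java
// Hypothetical helper, not part of Mahuta: illustrates paging arithmetic only.
class PageMath {

    /** Index of the first element on a zero-based page. */
    static long offset(int page, int size) {
        return (long) page * size;
    }

    /** Pages needed to hold totalElements items (ceiling division). */
    static long totalPages(long totalElements, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        return (totalElements + size - 1) / size;
    }

    public static void main(String[] args) {
        System.out.println("offset of page 2, size 20: " + offset(2, 20));
        System.out.println("pages for 41 elements, size 20: " + totalPages(41, 20));
    }
}
```

Under these assumptions, a single matching document with a page size of 20 fits on one page.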

Spring-Data

  1. Import the Maven dependencies
<dependency>
    <groupId>net.consensys.mahuta</groupId>
    <artifactId>mahuta-springdata</artifactId>
    <version>${MAHUTA_VERSION}</version>
</dependency>
  2. Configure your Spring Data repository
@IPFSDocument(index = "article", indexConfiguration = "article_mapping.json", indexContent = true)
public class Article {
    
    @Id
    private String id;

    @Hash
    private String hash;

    @Fulltext
    private String title;

    @Fulltext
    private String content;

    @Indexfield
    private Date createdAt;

    @Indexfield
    private String createdBy;
}



public class ArticleRepository extends MahutaRepositoryImpl<Article, String> {

    public ArticleRepository(Mahuta mahuta) {
        super(mahuta);
    }
}

For more information, see the Mahuta Spring Data documentation.

HTTP API with Docker

Prerequisites
Docker
$ docker run -it --name mahuta \
    -p 8040:8040 \
    -e MAHUTA_IPFS_HOST=ipfs \
    -e MAHUTA_ELASTICSEARCH_HOST=elasticsearch \
    gjeanmart/mahuta
Docker Compose

See the sample Docker Compose file.

Examples

To access the API documentation, go to Mahuta HTTP API

Create the index article
  • Sample Request:
curl -X POST \
  http://localhost:8040/mahuta/config/index/article \
  -H 'Content-Type: application/json'
  • Success Response:

    • Code: 200
      Content: { "status": "SUCCESS" }
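For JVM clients that prefer the HTTP API over the embedded library, the same call can be prepared with the JDK's built-in `java.net.http` client (Java 11+). This is a minimal sketch: the class name `CreateIndexRequest` is hypothetical, and the base URL assumes the container from the `docker run` command is reachable on `localhost:8040`.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical client-side helper; only the URL and header come from the curl sample.
class CreateIndexRequest {

    // Assumes the Mahuta container is published on localhost:8040.
    static final String BASE_URL = "http://localhost:8040/mahuta";

    /** Builds the POST request that creates an index with the given name. */
    static HttpRequest build(String indexName) {
        return HttpRequest.newBuilder()
                .uri(URI.create(BASE_URL + "/config/index/" + indexName))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("article");
        System.out.println(request.method() + " " + request.uri());
        // To send it against a running Mahuta instance:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```

Building the request separately from sending it keeps the example checkable without a running server.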
Store and index an article and its metadata
  • Sample Request:
curl -X POST \
  'http://localhost:8040/mahuta/index' \
  -H 'content-type: application/json' \
  -d '{"content":"# Hello world,\n this is my first file stored on **IPFS**","indexName":"article","indexDocId":"hello_world","contentType":"text/markdown","index_fields":{"title":"Hello world","author":"Gregoire Jeanmart","votes":10,"date_created":1518700549,"tags":["general"]}}'
  • Success Response:

    • Code: 200
      Content:
{
  "indexName": "article",
  "indexDocId": "hello_world",
  "contentId": "QmWHR4e1JHMs2h7XtbDsS9r2oQkyuzVr5bHdkEMYiqfeNm",
  "contentType": "text/markdown",
  "pinned": true,
  "indexFields": {
    "title": "Hello world",
    "author": "Gregoire Jeanmart",
    "votes": 10,
    "createAt": 1518700549,
    "tags": ["general"]
  },
  "status": "SUCCESS"
}
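The indexing request can also be assembled from Java. The sketch below builds the JSON body by hand, using the field names from the sample request (`content`, `indexName`, `indexDocId`, `contentType`, `index_fields`), and wraps it in a `java.net.http.HttpRequest`. The class `IndexDocumentRequest` is hypothetical, and for brevity it only supports string-valued index fields, unlike the sample, which also sends a number and an array. A real client would more likely use a JSON library.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical client-side helper; URL and JSON field names come from the curl sample.
class IndexDocumentRequest {

    // Assumes the Mahuta container is published on localhost:8040.
    static final String BASE_URL = "http://localhost:8040/mahuta";

    /** Wraps a value in double quotes, escaping backslashes, quotes, \n, \r and \t. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\")
                       .replace("\"", "\\\"")
                       .replace("\n", "\\n")
                       .replace("\r", "\\r")
                       .replace("\t", "\\t") + "\"";
    }

    /** Serialises the request body; index fields are limited to strings in this sketch. */
    static String body(String indexName, String indexDocId, String contentType,
                       String content, Map<String, String> indexFields) {
        StringJoiner fields = new StringJoiner(",", "{", "}");
        indexFields.forEach((k, v) -> fields.add(quote(k) + ":" + quote(v)));
        return "{\"content\":" + quote(content)
                + ",\"indexName\":" + quote(indexName)
                + ",\"indexDocId\":" + quote(indexDocId)
                + ",\"contentType\":" + quote(contentType)
                + ",\"index_fields\":" + fields + "}";
    }

    /** Builds the POST request that stores and indexes a document. */
    static HttpRequest build(String jsonBody) {
        return HttpRequest.newBuilder()
                .uri(URI.create(BASE_URL + "/index"))
                .header("content-type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    public static void main(String[] args) {
        String json = body("article", "hello_world", "text/markdown",
                "# Hello world,\n this is my first file stored on **IPFS**",
                Map.of("title", "Hello world"));
        System.out.println(build(json).uri());
        System.out.println(json);
    }
}
```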
Search by query
  • Sample Request:
curl -X POST \
 'http://localhost:8040/mahuta/query/search?index=article' \
 -H 'content-type: application/json' \
 -d '{"query":[{"name":"title","operation":"CONTAINS","value":"Hello"},{"name":"author.keyword","operation":"EQUALS","value":"Gregoire Jeanmart"},{"name":"votes","operation":"GT","value":"5"}]}'
  • Success Response:

    • Code: 200
      Content:
{
  "status": "SUCCESS",
  "page": {
    "pageRequest": {
      "page": 0,
      "size": 20,
      "sort": null,
      "direction": "ASC"
    },
    "elements": [
      {
        "metadata": {
          "indexName": "article",
          "indexDocId": "hello_world",
          "contentId": "Qmd6VkHiLbLPncVQiewQe3SBP8rrG96HTkYkLbMzMe6tP2",
          "contentType": "text/markdown",
          "pinned": true,
          "indexFields": {
            "author": "Gregoire Jeanmart",
            "votes": 10,
            "title": "Hello world",
            "createAt": 1518700549,
            "tags": ["general"]
          }
        }
      }
    ],
    "totalElements": 1,
    "totalPages": 1
  }
}
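The search body is a list of criteria, each with a `name`, an `operation` and a `value`. The sketch below shows one way a Java client might assemble that body and the matching request. `SearchQueryRequest`, `Criterion` and `Operation` are hypothetical names, and the enum lists only the three operations that appear in the sample request; the API may support others.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical client-side helper; URL and JSON shape come from the curl sample.
class SearchQueryRequest {

    // Assumes the Mahuta container is published on localhost:8040.
    static final String BASE_URL = "http://localhost:8040/mahuta";

    /** Only the operations seen in the sample request; others may exist. */
    enum Operation { CONTAINS, EQUALS, GT }

    /** One filter of the query: field name, operation and value. */
    static final class Criterion {
        final String name;
        final Operation operation;
        final String value;

        Criterion(String name, Operation operation, String value) {
            this.name = name;
            this.operation = operation;
            this.value = value;
        }

        String toJson() {
            return "{\"name\":" + quote(name)
                    + ",\"operation\":\"" + operation + "\""
                    + ",\"value\":" + quote(value) + "}";
        }
    }

    /** Wraps a value in double quotes, escaping backslashes and quotes. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Serialises the criteria into the {"query":[...]} body. */
    static String body(List<Criterion> criteria) {
        return criteria.stream()
                .map(Criterion::toJson)
                .collect(Collectors.joining(",", "{\"query\":[", "]}"));
    }

    /** Builds the POST request that searches the given index. */
    static HttpRequest build(String indexName, List<Criterion> criteria) {
        String index = URLEncoder.encode(indexName, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder()
                .uri(URI.create(BASE_URL + "/query/search?index=" + index))
                .header("content-type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body(criteria)))
                .build();
    }

    public static void main(String[] args) {
        List<Criterion> criteria = List.of(
                new Criterion("title", Operation.CONTAINS, "Hello"),
                new Criterion("author.keyword", Operation.EQUALS, "Gregoire Jeanmart"),
                new Criterion("votes", Operation.GT, "5"));
        System.out.println(build("article", criteria).uri());
        System.out.println(body(criteria));
    }
}
```

Typing the operation as an enum catches misspelled operation names at compile time, at the cost of having to extend the enum if more operations are needed.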
Content is licensed under CC-BY-SA 4.0.

Article author: Grégoire Jeanmart, Kauri Software Engineer
