From 07ccab9766a6289326676a4814537564f25f35fa Mon Sep 17 00:00:00 2001
From: Ekaterina Vaartis
Date: Mon, 23 Aug 2021 20:27:16 +0300
Subject: Add search/meilisearch documentation

---
 docs/configuration/search.md | 99 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 99 insertions(+)
 create mode 100644 docs/configuration/search.md

(limited to 'docs/configuration/search.md')

diff --git a/docs/configuration/search.md b/docs/configuration/search.md
new file mode 100644
index 000000000..14ec2bc63
--- /dev/null
+++ b/docs/configuration/search.md
@@ -0,0 +1,99 @@
+# Configuring search
+
+{! backend/administration/CLI_tasks/general_cli_task_info.include !}
+
+## Built-in search
+
+To use built-in search that has no external dependencies, set the search module to `Pleroma.Activity`:
+
+> config :pleroma, Pleroma.Search, module: Pleroma.Activity
+
+While it has no external dependencies, it has problems with performance and relevancy.
+
+## Meilisearch
+
+To use [meilisearch](https://www.meilisearch.com/), set the search module to `Pleroma.Search.Meilisearch`:
+
+> config :pleroma, Pleroma.Search, module: Pleroma.Search.Meilisearch
+
+You then need to set the address of the meilisearch instance, and optionally the private key for authentication.
+
+> config :pleroma, Pleroma.Search.Meilisearch,
+> url: "http://127.0.0.1:7700/",
+> private_key: "private key"
+
+Information about setting up meilisearch can be found in the
+[official documentation](https://docs.meilisearch.com/learn/getting_started/installation.html).
+You probably want to start it with `MEILI_NO_ANALYTICS=true` and `MEILI_NO_CENTRY=true` environment variables,
+to disable analytics.
+
+### Private key authentication (optional)
+
+To set the private key, use the `MEILI_MASTER_KEY` environment variable when starting. After setting the _master key_,
+you have to get the _private key_, which is actually used for authentication.
+
+=== "OTP"
+    ```sh
+    ./bin/pleroma_ctl search.meilisearch show-private-key
+    ```
+
+=== "From Source"
+    ```sh
+    mix pleroma.search.meilisearch show-private-key
+    ```
+
+This is the key you actually put into your configuration file.
+
+### Initial indexing
+
+After setting up the configuration, you'll want to index all of your already existsing posts. Only public posts are indexed. You'll only
+have to do it one time, but it might take a while, depending on the amount of posts your instance has seen. This is also a fairly RAM
+consuming process for `meilisearch`, and it will take a lot of RAM when running if you have a lot of posts (seems to be around 5G for ~1.2
+million posts while idle and up to 7G while indexing initially, but your experience may be different).
+
+To start te initial indexing, run the `index` command:
+
+=== "OTP"
+    ```sh
+    ./bin/pleroma_ctl search.meilisearch index
+    ```
+
+=== "From Source"
+    ```sh
+    mix pleroma.search.meilisearch index
+    ```
+
+This will show you the total amount of posts to index, and then show you the amount of posts indexed currently, until the numbers eventually
+become the same. The posts are indexed in big batches and meilisearch will take some time to actually index them, even after you have
+inserted all the posts into it. Depending on the amount of posts, this may be as long as several hours. To get information about the status
+of indexing and how many posts have actually been indexed, use the `stats` command:
+
+=== "OTP"
+    ```sh
+    ./bin/pleroma_ctl search.meilisearch stats
+    ```
+
+=== "From Source"
+    ```sh
+    mix pleroma.search.meilisearch stats
+    ```
+
+### Clearing the index
+
+In case you need to clear the index (for example, to re-index from scratch, if that needs to happen for some reason), you can
+use the `clear` command:
+
+=== "OTP"
+    ```sh
+    ./bin/pleroma_ctl search.meilisearch clear
+    ```
+
+=== "From Source"
+    ```sh
+    mix pleroma.search.meilisearch clear
+    ```
+
+This will clear **all** the posts from the search index. Note, that deleted posts are also removed from index by the instance itself, so
+there is no need to actually clear the whole index, unless you want **all** of it gone. That said, the index does not hold any information
+that cannot be re-created from the database, it should also generally be a lot smaller than the size of your database. Still, the size
+depends on the amount of text in posts.
-- cgit v1.2.3

From c569ad05b3d812c87171e68eac79eec749321033 Mon Sep 17 00:00:00 2001
From: Ekaterina Vaartis
Date: Tue, 12 Oct 2021 19:14:39 +0300
Subject: Add more documentation about rum to meilisearch docs

---
 docs/configuration/search.md | 9 +++++++++
 1 file changed, 9 insertions(+)

(limited to 'docs/configuration/search.md')

diff --git a/docs/configuration/search.md b/docs/configuration/search.md
index 14ec2bc63..e9743f1a4 100644
--- a/docs/configuration/search.md
+++ b/docs/configuration/search.md
@@ -12,6 +12,15 @@ While it has no external dependencies, it has problems with performance and rele
 
 ## Meilisearch
 
+Note that it's quite a bit more memory hungry than PostgreSQL (around 4-5G for ~1.2 million
+posts while idle and up to 7G while indexing initially). The disk usage for this additional index is also
+around 4 gigabytes. Like [RUM](./cheatsheet.md#rum-indexing-for-full-text-search) indexes, it offers considerably
+higher performance and ordering by timestamp in a reasonable amount of time.
+Additionally, the search results seem to be more accurate.
+
+Due to high memory usage, it may be best to set it up on a different machine, if running pleroma on a low-resource
+computer, and use private key authentication to secure the remote search instance.
+
 To use [meilisearch](https://www.meilisearch.com/), set the search module to `Pleroma.Search.Meilisearch`:
 
 > config :pleroma, Pleroma.Search, module: Pleroma.Search.Meilisearch
-- cgit v1.2.3

From 8898b5e927bae27a521e4eadd0faf970ad27c5bc Mon Sep 17 00:00:00 2001
From: Ekaterina Vaartis
Date: Sun, 14 Nov 2021 20:15:12 +0300
Subject: Fix a typo in search docs

---
 docs/configuration/search.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

(limited to 'docs/configuration/search.md')

diff --git a/docs/configuration/search.md b/docs/configuration/search.md
index e9743f1a4..9adc7884f 100644
--- a/docs/configuration/search.md
+++ b/docs/configuration/search.md
@@ -60,7 +60,7 @@ have to do it one time, but it might take a while, depending on the amount of po
 consuming process for `meilisearch`, and it will take a lot of RAM when running if you have a lot of posts (seems to be around 5G for ~1.2
 million posts while idle and up to 7G while indexing initially, but your experience may be different).
 
-To start te initial indexing, run the `index` command:
+To start the initial indexing, run the `index` command:
 
 === "OTP"
     ```sh
-- cgit v1.2.3

From a6946048fbe049aa223d094d36eb767739ab5ff2 Mon Sep 17 00:00:00 2001
From: Ekaterina Vaartis
Date: Wed, 17 Nov 2021 22:29:49 +0300
Subject: Rename Activity.Search to Search.DatabaseSearch

---
 docs/configuration/search.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

(limited to 'docs/configuration/search.md')

diff --git a/docs/configuration/search.md b/docs/configuration/search.md
index 9adc7884f..c7e77d9c2 100644
--- a/docs/configuration/search.md
+++ b/docs/configuration/search.md
@@ -6,7 +6,7 @@
 
 To use built-in search that has no external dependencies, set the search module to `Pleroma.Activity`:
 
-> config :pleroma, Pleroma.Search, module: Pleroma.Activity
+> config :pleroma, Pleroma.Search, module: Pleroma.Search.DatabaseSearch
 
 While it has no external dependencies, it has problems with performance and relevancy.
 
-- cgit v1.2.3

From 3412713c5b2fd24605b18933ef70de164ee14f2d Mon Sep 17 00:00:00 2001
From: Ekaterina Vaartis
Date: Mon, 20 Dec 2021 18:16:33 +0300
Subject: Update search.md documentation with meilisearch indexing steps

---
 docs/configuration/search.md | 9 +++++++++
 1 file changed, 9 insertions(+)

(limited to 'docs/configuration/search.md')

diff --git a/docs/configuration/search.md b/docs/configuration/search.md
index c7e77d9c2..7dbbd3e17 100644
--- a/docs/configuration/search.md
+++ b/docs/configuration/search.md
@@ -60,6 +60,15 @@ have to do it one time, but it might take a while, depending on the amount of po
 consuming process for `meilisearch`, and it will take a lot of RAM when running if you have a lot of posts (seems to be around 5G for ~1.2
 million posts while idle and up to 7G while indexing initially, but your experience may be different).
 
+The sequence of actions is as follows:
+
+1. First, change the configuration to use `Pleroma.Search.Meilisearch` as the search backend
+2. Restart your instance, at this point it can be used while the search indexing is running, though search won't return anything
+3. Start the initial indexing process (as described below with `index`),
+   and wait until the task says it sent everything from the database to index
+4. Wait until everything is actually indexed (by checking with `stats` as described below),
+   at this point you don't have to do anything, just wait a while.
+
 To start the initial indexing, run the `index` command:
 
 === "OTP"
-- cgit v1.2.3

From 4f2637acc6c46ea39ae38e869903e7ffcc38b34d Mon Sep 17 00:00:00 2001
From: Ekaterina Vaartis
Date: Mon, 20 Dec 2021 19:27:22 +0300
Subject: Add description for initial_indexing_chunk_size

---
 docs/configuration/search.md | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

(limited to 'docs/configuration/search.md')

diff --git a/docs/configuration/search.md b/docs/configuration/search.md
index 7dbbd3e17..a785a18ad 100644
--- a/docs/configuration/search.md
+++ b/docs/configuration/search.md
@@ -25,11 +25,15 @@ To use [meilisearch](https://www.meilisearch.com/), set the search module to `Pl
 
 > config :pleroma, Pleroma.Search, module: Pleroma.Search.Meilisearch
 
-You then need to set the address of the meilisearch instance, and optionally the private key for authentication.
+You then need to set the address of the meilisearch instance, and optionally the private key for authentication. You might
+also want to change the `initial_indexing_chunk_size` to be smaller if you're server is not very powerful, but not higher than `100_000`,
+because meilisearch will refuse to process it if it's too big. However, in general you want this to be as big as possible, because meilisearch
+indexes faster when it can process many posts in a single batch.
 
 > config :pleroma, Pleroma.Search.Meilisearch,
 > url: "http://127.0.0.1:7700/",
-> private_key: "private key"
+> private_key: "private key",
+> initial_indexing_chunk_size: 100_000
 
 Information about setting up meilisearch can be found in the
 [official documentation](https://docs.meilisearch.com/learn/getting_started/installation.html).
-- cgit v1.2.3

From 1e23f527e3e22108b402552a0766e488048ed3f4 Mon Sep 17 00:00:00 2001
From: Ekaterina Vaartis
Date: Tue, 22 Mar 2022 20:29:17 +0300
Subject: Change the meilisearch key auth to conform to 0.25.0

---
 docs/configuration/search.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

(limited to 'docs/configuration/search.md')

diff --git a/docs/configuration/search.md b/docs/configuration/search.md
index a785a18ad..82217e5ee 100644
--- a/docs/configuration/search.md
+++ b/docs/configuration/search.md
@@ -47,15 +47,15 @@ you have to get the _private key_, which is actually used for authentication.
 
 === "OTP"
     ```sh
-    ./bin/pleroma_ctl search.meilisearch show-private-key
+    ./bin/pleroma_ctl search.meilisearch show-keys
     ```
 
 === "From Source"
     ```sh
-    mix pleroma.search.meilisearch show-private-key
+    mix pleroma.search.meilisearch show-keys
     ```
 
-This is the key you actually put into your configuration file.
+You will see a "Default Admin API Key", this is the key you actually put into your configuration file.
 
 ### Initial indexing
 
-- cgit v1.2.3

From b150e6f15e0f06c8e23c0ac66aeaf80eb2f8c31a Mon Sep 17 00:00:00 2001
From: Ekaterina Vaartis
Date: Wed, 23 Mar 2022 11:36:01 +0300
Subject: Update meilisearch docs

---
 docs/configuration/search.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

(limited to 'docs/configuration/search.md')

diff --git a/docs/configuration/search.md b/docs/configuration/search.md
index 82217e5ee..f131948a7 100644
--- a/docs/configuration/search.md
+++ b/docs/configuration/search.md
@@ -37,8 +37,10 @@ indexes faster when it can process many posts in a single batch.
 
 Information about setting up meilisearch can be found in the
 [official documentation](https://docs.meilisearch.com/learn/getting_started/installation.html).
-You probably want to start it with `MEILI_NO_ANALYTICS=true` and `MEILI_NO_CENTRY=true` environment variables,
-to disable analytics.
+You probably want to start it with `MEILI_NO_ANALYTICS=true` environment variable to disable analytics.
+At least version 0.25.0 is required, but you are strongly adviced to use at least 0.26.0, as it introduces
+the `--enable-auto-batching` option which drastically improves performance. Without this option, the search
+is hardly usable on a somewhat big instance.
 
 ### Private key authentication (optional)
 
-- cgit v1.2.3

From 017e35fbf128d47c033275a70b76b72f24d7c754 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?marcin=20miko=C5=82ajczak?=
Date: Thu, 28 Dec 2023 00:15:32 +0100
Subject: Fix some more typos
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: marcin mikołajczak
---
 docs/configuration/search.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

(limited to 'docs/configuration/search.md')

diff --git a/docs/configuration/search.md b/docs/configuration/search.md
index f131948a7..0316c9bf4 100644
--- a/docs/configuration/search.md
+++ b/docs/configuration/search.md
@@ -38,7 +38,7 @@ indexes faster when it can process many posts in a single batch.
 Information about setting up meilisearch can be found in the
 [official documentation](https://docs.meilisearch.com/learn/getting_started/installation.html).
 You probably want to start it with `MEILI_NO_ANALYTICS=true` environment variable to disable analytics.
-At least version 0.25.0 is required, but you are strongly adviced to use at least 0.26.0, as it introduces
+At least version 0.25.0 is required, but you are strongly advised to use at least 0.26.0, as it introduces
 the `--enable-auto-batching` option which drastically improves performance. Without this option, the search
 is hardly usable on a somewhat big instance.
 
@@ -61,7 +61,7 @@ You will see a "Default Admin API Key", this is the key you actually put into yo
 
 ### Initial indexing
 
-After setting up the configuration, you'll want to index all of your already existsing posts. Only public posts are indexed. You'll only
+After setting up the configuration, you'll want to index all of your already existing posts. Only public posts are indexed. You'll only
 have to do it one time, but it might take a while, depending on the amount of posts your instance has seen. This is also a fairly RAM
 consuming process for `meilisearch`, and it will take a lot of RAM when running if you have a lot of posts (seems to be around 5G for ~1.2
 million posts while idle and up to 7G while indexing initially, but your experience may be different).
-- cgit v1.2.3
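
For orientation, the configuration that the documentation describes once the whole series is applied can be read as one fragment. This is only a sketch assembled from the values shown in the diffs; the key is a placeholder, and the assumption is that the fragment is appended to an instance configuration file that already has its usual header.

```elixir
# Sketch only: assumed to be appended to an existing Pleroma config file.

# Select meilisearch as the search backend. The built-in alternative
# named in the series is Pleroma.Search.DatabaseSearch.
config :pleroma, Pleroma.Search, module: Pleroma.Search.Meilisearch

config :pleroma, Pleroma.Search.Meilisearch,
  # Address of the meilisearch instance.
  url: "http://127.0.0.1:7700/",
  # Placeholder: the "Default Admin API Key" reported by `show-keys`.
  private_key: "private key",
  # May be lowered on a weaker server; the docs say not to exceed 100_000.
  initial_indexing_chunk_size: 100_000
```

The `private_key` entry is optional per the documentation and only matters when meilisearch was started with `MEILI_MASTER_KEY` set.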