Functionality to fix a Corrupted ElasticSearch Index #1048

Open
bodo-hugo-barwich opened this issue Dec 18, 2021 · 18 comments

@bodo-hugo-barwich
Contributor

As documented at
MetaCPAN API - Indexing failed
ElasticSearch indices can become corrupted for different reasons.
Often they can be fixed by re-creating the index.
But as seen in the development at
ElasticSearch Availabilty Check and Mapping Self-Check
there are still many open questions about how exactly to implement this.

@bodo-hugo-barwich
Contributor Author

During testing it turned out that the index corruption usually happens when a test run is forcefully interrupted shortly after it starts:

$ docker-compose exec -T api_test prove -lr --jobs 2 t
%hshtestindices{...} in scalar context better written as $hshtestindices{...} at t/lib/MetaCPAN/TestServer.pm line 193.
	require MetaCPAN/TestServer.pm called at t/00_setup.t line 15
	main::BEGIN() called at t/lib/MetaCPAN/TestServer.pm line 193
	eval {...} called at t/lib/MetaCPAN/TestServer.pm line 193
^CERROR: Aborting.
$ docker-compose exec -T api_test prove -lr --jobs 2 t
[Request] ** [http://elasticsearch_test:9200]-[400] [invalid_alias_name_exception] Invalid alias name [cpan], an index exists with the same name as the alias, with: {"index":"cpan_v1_01"}, called from sub MetaCPAN::Script::Mapping::deploy_mapping at /metacpan-api/lib/MetaCPAN/Script/Mapping.pm line 142. With vars: {'status_code' => 400,'request' => {'serialize' => 'std','path' => '/cpan_v1_01/_alias/cpan','ignore' => [],'method' => 'PUT','body' => undef,'qs' => {}},'body' => {'error' => {'reason' => 'Invalid alias name [cpan], an index exists with the same name as the alias','index' => 'cpan_v1_01','root_cause' => [{'type' => 'invalid_alias_name_exception','index' => 'cpan_v1_01','reason' => 'Invalid alias name [cpan], an index exists with the same name as the alias'}],'type' => 'invalid_alias_name_exception'},'status' => 400}}
 at /usr/local/lib/perl5/site_perl/5.30.1/Search/Elasticsearch/Util.pm line 71.
	Search::Elasticsearch::Util::throw("Request", "[http://elasticsearch_test:9200]-[400] [invalid_alias_name_ex"..., HASH(0x559331fac388)) called at /usr/local/lib/perl5/site_perl/5.30.1/Search/Elasticsearch/Role/Cxn.pm line 172
	Search::Elasticsearch::Role::Cxn::process_response(Search::Elasticsearch::Cxn::HTTPTiny=HASH(0x5593392c9008), HASH(0x5593393637d0), 400, "Bad Request", "{\"error\":{\"root_cause\":[{\"type\":\"invalid_alias_name_exception"..., "application/json") called at /usr/local/lib/perl5/site_perl/5.30.1/Search/Elasticsearch/Role/Cxn/HTTP.pm line 135
	Search::Elasticsearch::Role::Cxn::HTTP::__ANON__[/usr/local/lib/perl5/site_perl/5.30.1/Search/Elasticsearch/Role/Cxn/HTTP.pm:136](CODE(0x559331fabdb8), Search::Elasticsearch::Cxn::HTTPTiny=HASH(0x5593392c9008), HASH(0x5593393637d0), 400, "Bad Request", "{\"error\":{\"root_cause\":[{\"type\":\"invalid_alias_name_exception"..., HASH(0x559339373e38)) called at (eval 2255)[/usr/local/lib/perl5/site_perl/5.30.1/Class/Method/Modifiers.pm:89] line 1
	Search::Elasticsearch::Cxn::HTTPTiny::__ANON__[(eval 2255)[/usr/local/lib/perl5/site_perl/5.30.1/Class/Method/Modifiers.pm:89]:1](Search::Elasticsearch::Cxn::HTTPTiny=HASH(0x5593392c9008), HASH(0x5593393637d0), 400, "Bad Request", "{\"error\":{\"root_cause\":[{\"type\":\"invalid_alias_name_exception"..., HASH(0x559339373e38)) called at (eval 2257)[/usr/local/lib/perl5/site_perl/5.30.1/Class/Method/Modifiers.pm:148] line 2
	Search::Elasticsearch::Cxn::HTTPTiny::process_response(Search::Elasticsearch::Cxn::HTTPTiny=HASH(0x5593392c9008), HASH(0x5593393637d0), 400, "Bad Request", "{\"error\":{\"root_cause\":[{\"type\":\"invalid_alias_name_exception"..., HASH(0x559339373e38)) called at /usr/local/lib/perl5/site_perl/5.30.1/Search/Elasticsearch/Cxn/HTTPTiny.pm line 42
	Search::Elasticsearch::Cxn::HTTPTiny::perform_request(Search::Elasticsearch::Cxn::HTTPTiny=HASH(0x5593392c9008), HASH(0x5593393637d0)) called at (eval 2251)[/usr/local/lib/perl5/site_perl/5.30.1/Class/Method/Modifiers.pm:148] line 6
	Search::Elasticsearch::Cxn::HTTPTiny::perform_request(Search::Elasticsearch::Cxn::HTTPTiny=HASH(0x5593392c9008), HASH(0x5593393637d0)) called at /usr/local/lib/perl5/site_perl/5.30.1/Search/Elasticsearch/Transport.pm line 29
	Search::Elasticsearch::Transport::try {...} () called at /usr/local/lib/perl5/site_perl/5.30.1/Try/Tiny.pm line 102
	eval {...} called at /usr/local/lib/perl5/site_perl/5.30.1/Try/Tiny.pm line 93
	Try::Tiny::try(CODE(0x5593393597e8), Try::Tiny::Catch=REF(0x559339363b18)) called at /usr/local/lib/perl5/site_perl/5.30.1/Search/Elasticsearch/Transport.pm line 41
	Search::Elasticsearch::Transport::perform_request(Search::Elasticsearch::Transport=HASH(0x559330c17938), HASH(0x559339374210)) called at /usr/local/lib/perl5/site_perl/5.30.1/Search/Elasticsearch/Role/Client.pm line 16
	Search::Elasticsearch::Role::Client::perform_request(Search::Elasticsearch::Client::2_0::Direct::Indices=HASH(0x5593393687d0), HASH(0x55933935c0e0), "index", "cpan_v1_01", "name", "cpan") called at /usr/local/lib/perl5/site_perl/5.30.1/Search/Elasticsearch/Role/Client/Direct.pm line 103
	Search::Elasticsearch::Role::Client::Direct::__ANON__[/usr/local/lib/perl5/site_perl/5.30.1/Search/Elasticsearch/Role/Client/Direct.pm:104](Search::Elasticsearch::Client::2_0::Direct::Indices=HASH(0x5593393687d0), "index", "cpan_v1_01", "name", "cpan") called at /metacpan-api/lib/MetaCPAN/Script/Mapping.pm line 491
	MetaCPAN::Script::Mapping::deploy_mapping(MetaCPAN::Script::Mapping=HASH(0x5593392ff0f8)) called at /metacpan-api/lib/MetaCPAN/Script/Mapping.pm line 142
	MetaCPAN::Script::Mapping::run(MetaCPAN::Script::Mapping=HASH(0x5593392ff0f8)) called at /usr/local/lib/perl5/site_perl/5.30.1/x86_64-linux-gnu/Class/MOP/Method/Wrapped.pm line 44
	MetaCPAN::Script::Mapping::_wrapped_run(MetaCPAN::Script::Mapping=HASH(0x5593392ff0f8)) called at /usr/local/lib/perl5/site_perl/5.30.1/x86_64-linux-gnu/Class/MOP/Method/Wrapped.pm line 95
	MetaCPAN::Script::Mapping::run(MetaCPAN::Script::Mapping=HASH(0x5593392ff0f8)) called at t/lib/MetaCPAN/TestServer.pm line 220
	MetaCPAN::TestServer::put_mappings(MetaCPAN::TestServer=HASH(0x55932b7b3228)) called at t/lib/MetaCPAN/TestServer.pm line 65
	MetaCPAN::TestServer::setup(MetaCPAN::TestServer=HASH(0x55932b7b3228)) called at t/00_setup.t line 38
# Tests were run but no plan was declared and done_testing() was not seen.
# Looks like your test exited with 255 just after 2.
t/00_setup.t ............................... 
Dubious, test returned 255 (wstat 65280, 0xff00)
All 2 subtests passed 
^CERROR: Aborting.

Interrupting the indexing sequence can easily produce a corrupted index.

@bodo-hugo-barwich
Contributor Author

Interrupting t/00_setup.t can also produce corrupted test data, which then causes hard-to-explain failures in subsequent tests:

$ docker-compose exec -T api_test prove -vl t/server/controller/user/favorite.t

#   Failed test 'user looks like a bot'
#   at t/server/controller/user/favorite.t line 74.

#   Failed test 'forbidden'
#   at t/server/controller/user/favorite.t line 89.
#          got: '201'
#     expected: '403'
# Looks like you failed 2 tests of 23.
t/server/controller/user/favorite.t .. 
ok 1 - get user
ok 2 - code 200
ok 3 - valid json
ok 4 - got correct identity
ok 5 - got correct access_token
ok 6 - POST favorite
ok 7 - status created
ok 8 - location header set
ok 9 - GET http://localhost/favorite/AX8hCX3xPSUPrewe6jRd/Moose
ok 10 - found
ok 11 - valid json
ok 12 - user is AX8hCX3xPSUPrewe6jRd
ok 13 - DELETE /user/favorite/MO/Moose
ok 14 - status ok
ok 15 - GET http://localhost/favorite/AX8hCX3xPSUPrewe6jRd/Moose
ok 16 - not found
ok 17 - get bot
ok 18 - code 200
ok 19 - valid json
usr dmp:
{
  access_token => [{ client => "testing", token => "bot" }],
  id => "AX8hCX7QPSUPrewe6jRe",
  identity => [],
  looks_human => bless(do{\(my $o = 1)}, "JSON::PP::Boolean"),
  passed_captcha => "2022-02-22T10:52:22",
}
not ok 20 - user looks like a bot
ok 21 - POST favorite
ok 22 - valid json
not ok 23 - forbidden
1..23
Dubious, test returned 2 (wstat 512, 0x200)
Failed 2/23 subtests 

Test Summary Report
-------------------
t/server/controller/user/favorite.t (Wstat: 512 Tests: 23 Failed: 2)
  Failed tests:  20, 23
  Non-zero exit status: 2
Files=1, Tests=23,  6 wallclock secs ( 0.02 usr  0.00 sys +  4.76 cusr  0.17 csys =  4.95 CPU)
Result: FAIL

@oalders
Member

oalders commented Mar 15, 2022

@bodo-hugo-barwich where are we at on this?

@bodo-hugo-barwich
Contributor Author

bodo-hugo-barwich commented Mar 15, 2022

As documented in this issue, index corruption can easily happen during development.
When it happens, it can only be fixed manually to restore operation.
The established procedure is currently the one documented in the related issue:
Manually deleting a corrupted index

$ docker-compose exec api_test /bin/bash
root@6133962c5f5d:/metacpan-api# curl -XGET 'elasticsearch_test:9200/_cat/indices'
yellow open cover       1 1  0 0   159b   159b 
yellow open cpan_v1_01  1 1  0 0   159b   159b 
yellow open contributor 1 1  0 0   159b   159b 
yellow open cpan        5 1 11 5 79.9kb 79.9kb 
yellow open user        1 1  0 0   159b   159b 
root@6133962c5f5d:/metacpan-api# curl -XDELETE 'elasticsearch_test:9200/cpan'
{"acknowledged":true}

Alternatively, there is also a MetaCPAN::Script::Mapping command that can be executed like this:

$ docker-compose exec api_test bin/metacpan mapping --delete_index cpan

but this procedure also requires that the developer already knows that the cause is the corrupted index and that the wrongly created cpan index needs to be deleted.

The suggestion was to introduce a simplified command like:

$ docker-compose exec api_test index-cpan.sh --all

But in the conversation on the development at:
ElasticSearch Mapping Self-Check
I understood that this is not desirable.

Another approach could be:

$ docker-compose exec api_test index-cpan.sh --delete cpan

But I'm not sure if this is an actual need of the development team.

@oalders
Member

oalders commented Mar 15, 2022

I understood that this is not desirable.

I think the main concern is that we don't want to add any flags to destroy the index which would accidentally be passed through to a production instance. I'm fine with making it easier to recover from a corrupted Elasticsearch in the developer environment as long as it's restricted to that environment.

@mohawk2
Contributor

mohawk2 commented Mar 16, 2022

Is there a way to reliably detect an index has been corrupted?

@bodo-hugo-barwich
Contributor Author

bodo-hugo-barwich commented Mar 16, 2022

During development I saw that an index can become corrupted within the ElasticSearch engine when the indexing process is canceled unexpectedly. The index then shows the health state red within the ElasticSearch engine.
In this case, however, the cpan index is not corrupted as far as ElasticSearch is concerned; it is wrongly built and incompatible with the application.
As documented in the related issue:
cpan index is wrongly built
The issue with the cpan index is that it should not exist, and it prevents the application from inserting the package data correctly. cpan is supposed to be the alias of the correct index cpan_v1_01. Since the cpan index should not exist, the application is unaware of it and will not remove it. As a result the application will constantly fail, and the MetaCPAN API will always stay empty and cannot recover by itself.

Now, the task at hand aims to remove any wrongly built indices to break this doom loop.
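
To illustrate the wrongly built state described above, here is a minimal sketch in Python (not the project's Perl code) that flags alias names which exist as concrete indices. The function name `find_alias_conflicts` and its arguments are hypothetical; the index names are taken from the `_cat/indices` listing earlier in this thread.

```python
def find_alias_conflicts(existing_indices, expected_aliases):
    """Return alias names that wrongly exist as concrete indices.

    existing_indices: iterable of index names reported by the engine
    expected_aliases: dict mapping alias name -> index it should point to
    (Both parameters are assumptions made for this illustration.)
    """
    existing = set(existing_indices)
    return sorted(alias for alias in expected_aliases if alias in existing)


# Index names as shown by `_cat/indices` in the corrupted test environment.
indices = ["cover", "cpan_v1_01", "contributor", "cpan", "user"]
# `cpan` is supposed to be an alias of `cpan_v1_01`, not an index.
aliases = {"cpan": "cpan_v1_01"}

print(find_alias_conflicts(indices, aliases))  # ['cpan']
```

A check like this only reads the list of names, so it could run before any mapping is deployed without touching index content.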

@mohawk2
Contributor

mohawk2 commented Mar 17, 2022

I understand there are different ways for the data or index to be in a useless or wrong state. I'm asking if there's a way to reliably detect any or all of them, partly so the tests could immediately bail out in that instance. It was a simple yes/no question, and I don't understand a yes or no from this yet :-)

@bodo-hugo-barwich
Contributor Author

Yes, you can check the current index structure against a desired index structure.

Right now there is a test in t/00_setup.t, implemented in the MetaCPAN::TestServer module, which does this as part of the overall test run.
The test method MetaCPAN::TestServer::verify_mappings() does this verification:
Verify Index Structure Test

This test gives a 98% certainty that the ElasticSearch infrastructure is correct. (The 2% that are not covered are changes to the internal index fields.)

The MetaCPAN::Script::Mapping module also performs this kind of check during the setup phase:
Index Verification during Setup

So, when ./bin/run bin/metacpan mapping --delete finishes without reporting any errors, the ElasticSearch infrastructure setup is also correct.

Still, both the test 00_setup.t and the mapping command delete the previous index structure and re-create it. This might not be desirable when a developer is working on a structure change in the index.
A manual on-demand verification command is not implemented yet because the need was not clear at that point. A new mapping --verify command could provide a non-destructive index verification.

@bodo-hugo-barwich
Contributor Author

bodo-hugo-barwich commented Mar 17, 2022

I'm fine with making it easier to recover from a corrupted Elasticsearch in the developer environment as long as it's restricted to that environment.

I understand this requirement to mean that the application needs to be aware of the environment it runs in. An environment variable could provide this information and only allow the --delete --all functionality when this variable is set correctly. Otherwise the application must report an error and exit with an error code.

@mohawk2
Contributor

mohawk2 commented Mar 18, 2022

I'd like to see something like an index_is_valid function that can correctly answer that question - if it were possible to make it, then it should be equally applicable to both dev and production. It would have to deal with the 2% of cases Bodo mentions as not being dealt with yet.

@bodo-hugo-barwich
Contributor Author

bodo-hugo-barwich commented Mar 20, 2022

From the technical side, Search::Elasticsearch::Client exposes a sufficiently detailed API to implement a thorough mapping check with the methods Search::Elasticsearch::Client::2_0::Direct::Indices::get_mapping() and Search::Elasticsearch::Client::2_0::Direct::Indices::get_aliases().
From the viewpoint of correctness, this will be able to detect any accidental or engine-imposed automated changes.
It is also a non-destructive check which does not alter the actual content of the indices.
This check might be interesting to run at the end of the index creation sequence.

But from the operational viewpoint this will be an expensive check which, when overused, can result in a noticeable slowdown.
To give an idea of the size of the data, the complete mapping of the cpan_v1_01 index is a 14 KB JSON document:

root@1eda1969bdfb:/metacpan-api# curl -v elasticsearch:9200/cpan_v1_01/_mapping|jq  
# [...]
* Connected to elasticsearch (192.168.48.2) port 9200 (#0)
> GET /cpan_v1_01/_mapping HTTP/1.1
> Host: elasticsearch:9200
> User-Agent: curl/7.64.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< Content-Type: application/json; charset=UTF-8
< Content-Length: 14228
< 

Due to this concern, a lightweight quick check was implemented in MetaCPAN::Script::Mapping::verify_mapping().

So, the full check could be an additional functionality to be used at certain points within the installation process.

@bodo-hugo-barwich
Contributor Author

I saw in the code that the environment variables MOJO_MODE and PLACK_ENV are used to recognize the operating environment.
In the Docker environment MOJO_MODE has the values development or testing.
So, these values indicate a development environment and enable the --delete --all operation.
If those values are missing, the operation results in the exception *** EXECPTION [ 1 ] ***: Operation not permitted in environment: production, as seen in the test that I created in t/lib/MetaCPAN/TestServer.pm, because the --delete --all operation needs to be tested before the test environment setup:

# Subtest: delete all not permitted
start test '--delete --all'
    ok 1 - delete all fails
    ok 2 - Exit Code '1' - operation failed
mapping --delete --all - STDOUT:
'test '--delete --all' - ENV dmp:
(
  "PERL5LIB",
  "/metacpan-api/lib",
  "TEST2_ACTIVE",
  1,
  "HOME",
  "/root",
  "HARNESS_ACTIVE",
  1,
  "TEST_VERBOSE",
  1,
  "MINICPAN",
  "/CPAN",
  "HARNESS_IS_VERBOSE",
  1,
  "TEST_ACTIVE",
  1,
  "MOJO_MODE",
  "testing",
  "PATH",
  "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
  "PERL_MM_USE_DEFAULT",
  1,
  "HOSTNAME",
  "886ac8899f64",
  "TAP_VERSION",
  13,
  "PERL_LWP_SSL_VERIFY_HOSTNAME",
  0,
  "METACPAN_SERVER_CONFIG_LOCAL_SUFFIX",
  "testing",
  "PERL_CARTON_PATH",
  "/carton",
  "ES_TEST",
  "elasticsearch_test:9200",
  "HARNESS_VERSION",
  3.42,
  "ES",
  "elasticsearch_test:9200",
  "NET_ASYNC_HTTP_MAXCONNS",
  1,
  "COLUMNS",
  80,
)
delete_all() - ENV dmp:
(
  "HARNESS_ACTIVE",
  1,
  "TEST2_ACTIVE",
  1,
  "HARNESS_VERSION",
  3.42,
  "COLUMNS",
  80,
  "NET_ASYNC_HTTP_MAXCONNS",
  1,
  "PERL_CARTON_PATH",
  "/carton",
  "PERL_LWP_SSL_VERIFY_HOSTNAME",
  0,
  "TAP_VERSION",
  13,
  "PERL_MM_USE_DEFAULT",
  1,
  "TEST_ACTIVE",
  1,
  "HARNESS_IS_VERBOSE",
  1,
  "MINICPAN",
  "/CPAN",
  "HOME",
  "/root",
  "TEST_VERBOSE",
  1,
  "PERL5LIB",
  "/metacpan-api/lib",
  "ES",
  "elasticsearch_test:9200",
  "METACPAN_SERVER_CONFIG_LOCAL_SUFFIX",
  "testing",
  "ES_TEST",
  "elasticsearch_test:9200",
  "HOSTNAME",
  "886ac8899f64",
  "PATH",
  "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
)
*** ERROR ***: Operation not permitted!
*** EXECPTION [ 1 ] ***: Operation not permitted in environment: production at /metacpan-api/lib/MetaCPAN/Role/Script.pm line 173.
	MetaCPAN::Role::Script::handle_error(MetaCPAN::Script::Mapping=HASH(0x5653e6e4ef98), "Operation not permitted in environment: production", 1) called at /metacpan-api/lib/MetaCPAN/Script/Mapping.pm line 267
	MetaCPAN::Script::Mapping::delete_all(MetaCPAN::Script::Mapping=HASH(0x5653e6e4ef98)) called at /metacpan-api/lib/MetaCPAN/Script/Mapping.pm line 173
	MetaCPAN::Script::Mapping::run(MetaCPAN::Script::Mapping=HASH(0x5653e6e4ef98)) called at /usr/local/lib/perl5/site_perl/5.30.1/x86_64-linux-gnu/Class/MOP/Method/Wrapped.pm line 44
	MetaCPAN::Script::Mapping::_wrapped_run(MetaCPAN::Script::Mapping=HASH(0x5653e6e4ef98)) called at /usr/local/lib/perl5/site_perl/5.30.1/x86_64-linux-gnu/Class/MOP/Method/Wrapped.pm line 95
	MetaCPAN::Script::Mapping::run(MetaCPAN::Script::Mapping=HASH(0x5653e6e4ef98)) called at /metacpan-api/lib/MetaCPAN/Script/Runner.pm line 39
	MetaCPAN::Script::Runner::try {...} () called at /usr/local/lib/perl5/site_perl/5.30.1/Try/Tiny.pm line 102
	eval {...} called at /usr/local/lib/perl5/site_perl/5.30.1/Try/Tiny.pm line 93
	Try::Tiny::try(CODE(0x5653dc5f2df8), Try::Tiny::Catch=REF(0x5653e6e60268)) called at /metacpan-api/lib/MetaCPAN/Script/Runner.pm line 68
	MetaCPAN::Script::Runner::run() called at t/lib/MetaCPAN/TestServer.pm line 661
	MetaCPAN::TestServer::__ANON__[t/lib/MetaCPAN/TestServer.pm:689]() called at /usr/local/lib/perl5/5.30.1/Test/Builder.pm line 333
	eval {...} called at /usr/local/lib/perl5/5.30.1/Test/Builder.pm line 333
	Test::Builder::subtest(Test::Builder=HASH(0x5653e37f18b0), "delete all not permitted", CODE(0x5653dfb31d50)) called at /usr/local/lib/perl5/5.30.1/Test/More.pm line 809
	Test::More::subtest("delete all not permitted", CODE(0x5653dfb31d50)) called at t/lib/MetaCPAN/TestServer.pm line 689
	MetaCPAN::TestServer::test_delete_fails(MetaCPAN::TestServer=HASH(0x5653d7afbbc0)) called at t/lib/MetaCPAN/TestServer.pm line 629
	MetaCPAN::TestServer::test_delete_mappings(MetaCPAN::TestServer=HASH(0x5653d7afbbc0)) called at t/lib/MetaCPAN/TestServer.pm line 70
	MetaCPAN::TestServer::setup(MetaCPAN::TestServer=HASH(0x5653d7afbbc0)) called at t/00_setup.t line 38

'
mapping --delete --all - STDERR:
'2022/07/16 11:12:57 I mapping: Awaiting Elasticsearch ...
[2022/07/16 11:12:57] [catalyst] [INFO] Awaiting Elasticsearch ...
2022/07/16 11:12:57 I mapping: Awaiting 0 / 15 : ready
[2022/07/16 11:12:57] [catalyst] [INFO] Awaiting 0 / 15 : ready
2022/07/16 11:12:57 E mapping: Operation not permitted!
[2022/07/16 11:12:57] [catalyst] [ERROR] Operation not permitted!
2022/07/16 11:12:57 F mapping: Operation not permitted in environment: production
[2022/07/16 11:12:57] [catalyst] [FATAL] Operation not permitted in environment: production
'
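
The environment guard shown in the log above can be sketched as follows. This is an illustration in Python, not the project's Perl implementation: the helper names, the precedence of MOJO_MODE over PLACK_ENV, and the fallback to production when neither is set are assumptions derived from the log output in this comment.

```python
import os

# Modes reported for the Docker environment in this thread.
DEV_MODES = {"development", "testing"}


def current_environment(env):
    """Return the mode from MOJO_MODE or PLACK_ENV (assumed precedence),
    falling back to 'production' when neither is set."""
    return env.get("MOJO_MODE") or env.get("PLACK_ENV") or "production"


def assert_destructive_allowed(env=None):
    """Raise PermissionError unless the environment is a development mode.

    Hypothetical helper: a destructive operation such as `--delete --all`
    would call this before doing any work.
    """
    env = os.environ if env is None else env
    mode = current_environment(env)
    if mode not in DEV_MODES:
        raise PermissionError(f"Operation not permitted in environment: {mode}")
    return mode


# With MOJO_MODE set, the operation is allowed.
print(assert_destructive_allowed({"MOJO_MODE": "testing"}))

# Without it, the guard refuses, as in the subtest above.
try:
    assert_destructive_allowed({"HOME": "/root"})
except PermissionError as err:
    print(err)
```

Defaulting to production when the variable is absent means a missing configuration fails closed, which matches the concern about flags leaking into a production instance.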

@bodo-hugo-barwich
Contributor Author

When MetaCPAN::Role::Script::are_you_sure() is not answered with an exact match of the requested confirmation, it produces an exception: *** EXECPTION [ 125 ] ***: Operation canceled on User Request at /metacpan-api/bin/../lib/MetaCPAN/Script/Mapping.pm line 256. which in most cases stops any further processing.

$ docker-compose exec api bin/metacpan mapping --delete --all 
2022/07/16 11:19:55 I mapping: Awaiting Elasticsearch ...
2022/07/16 11:19:55 I mapping: Awaiting 0 / 15 : ready
delete_all() - ENV dmp:
(
  "HOME",
  "/root",
  "MOJO_MODE",
  "development",
  "TERM",
  "xterm",
  "PATH",
  "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
  "PERL_CARTON_PATH",
  "/carton",
  "ES_TEST",
  "elasticsearch_test:9200",
  "HOSTNAME",
  "6d35a76cfd95",
  "PERL_MM_USE_DEFAULT",
  1,
  "COLUMNS",
  80,
  "NET_ASYNC_HTTP_MAXCONNS",
  1,
  "ES",
  "elasticsearch:9200",
  "MINICPAN",
  "/CPAN",
)
interactive: on
*** Warning ***: ALL Indices will be deleted !!!
Are you sure you want to do this (type "YES" to confirm) ? y  
2022/07/16 11:20:09 E mapping: Confirmation incorrect: 'y'
Operation will be interruped!
2022/07/16 11:20:09 F mapping: Operation canceled on User Request
*** EXECPTION [ 125 ] ***: Operation canceled on User Request at /metacpan-api/bin/../lib/MetaCPAN/Script/Mapping.pm line 256.

@bodo-hugo-barwich
Contributor Author

When the interactive mode is off, this prompt will not occur and the operation will proceed.

$ docker-compose exec -T api bin/metacpan mapping --delete --all 
delete_all() - ENV dmp:
(
  "HOSTNAME",
  "6d35a76cfd95",
  "NET_ASYNC_HTTP_MAXCONNS",
  1,
  "PATH",
  "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
  "ES_TEST",
  "elasticsearch_test:9200",
  "ES",
  "elasticsearch:9200",
  "MOJO_MODE",
  "development",
  "PERL_CARTON_PATH",
  "/carton",
  "HOME",
  "/root",
  "COLUMNS",
  80,
  "PERL_MM_USE_DEFAULT",
  1,
  "MINICPAN",
  "/CPAN",
)
interactive: off
*** Warning ***: ALL Indices will be deleted !!!
interactive: off
*** Warning ***: this will delete EVERYTHING and re-create the (empty) indexes

When the interactive mode is off, any attempt to write to the terminal will produce an exception in the application, so there is no other way to handle this.
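
The confirmation behaviour described in the last two comments can be summarised in a small sketch. It is written in Python for illustration and is not the code of MetaCPAN::Role::Script::are_you_sure(); the class and function names are hypothetical, while the "YES" confirmation, the message and the exit code 125 come from the output above.

```python
class OperationCanceled(Exception):
    """Raised when the user does not confirm; carries exit code 125."""
    exit_code = 125


def confirm_destructive(answer, interactive, expected="YES"):
    """Return True if the operation may proceed.

    In non-interactive mode no prompt is possible, so the operation
    proceeds. In interactive mode only an exact match is accepted.
    """
    if not interactive:
        return True
    if answer != expected:
        raise OperationCanceled("Operation canceled on User Request")
    return True


print(confirm_destructive(None, interactive=False))   # proceeds without prompt
print(confirm_destructive("YES", interactive=True))   # exact match accepted

try:
    confirm_destructive("y", interactive=True)         # 'y' is not enough
except OperationCanceled as err:
    print(err.exit_code, err)
```

Because the non-interactive path skips the prompt entirely, the environment check is the only safeguard left when the command is run with `-T`.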

@bodo-hugo-barwich
Contributor Author

@mohawk2
The Mappings Verification implemented at
Verify Indices Mappings
can detect missing indices like:

2022/07/30 13:37:41 E mapping: Missing index: cover
[2022/07/30 13:37:41] [catalyst] [ERROR] Missing index: cover
# [...]
2022/07/30 13:37:41 I mapping: Verification indices: failed
[2022/07/30 13:37:41] [catalyst] [INFO] Verification indices: failed
2022/07/30 13:37:41 I mapping: Verification aliases: failed
[2022/07/30 13:37:41] [catalyst] [INFO] Verification aliases: failed
2022/07/30 13:37:41 E mapping: Indices Verification has failed!
[2022/07/30 13:37:41] [catalyst] [ERROR] Indices Verification has failed!

and also differences between the mappings definition and the actually deployed mappings.

[2022/07/30 13:37:41] [catalyst] [INFO] Verifying index: cover
2022/07/30 13:37:41 E mapping: Mismatch field: cover.cover.properties.version.ignore_above (1024 <> 2048)
[2022/07/30 13:37:41] [catalyst] [ERROR] Mismatch field: cover.cover.properties.version.ignore_above (1024 <> 2048)
2022/07/30 13:37:41 E mapping: Broken index: cover (mapping does not match definition)
[2022/07/30 13:37:41] [catalyst] [ERROR] Broken index: cover (mapping does not match definition)
# [...]
2022/07/30 13:37:41 I mapping: Verification indices: failed
[2022/07/30 13:37:41] [catalyst] [INFO] Verification indices: failed
2022/07/30 13:37:41 I mapping: Verification aliases: failed
[2022/07/30 13:37:41] [catalyst] [INFO] Verification aliases: failed
2022/07/30 13:37:41 E mapping: Indices Verification has failed!
[2022/07/30 13:37:41] [catalyst] [ERROR] Indices Verification has failed!

which can be seen in the test report:

# Subtest: missing index
    # Subtest: delete cover index
*** Warning ***: Index cover will be deleted !!!
        ok 1 - deletion 'cover' succeeds
        ok 2 - Exit Code '0' - No Error
        1..2
    ok 1 - delete cover index
    # Subtest: mapping verification fails
*** ERROR ***: Indices Verification has failed!
        ok 1 - verification execution fails
        ok 2 - Exit Code '1' - Verification Error
        1..2
    ok 2 - mapping verification fails
    # Subtest: re-create cover index
        ok 1 - creation 'cover' succeeds
        ok 2 - Exit Code '0' - No Error
        1..2
    ok 3 - re-create cover index
    1..3
ok 10 - missing index
# Subtest: field mismatch
    # Subtest: mapping change field
*** Warning ***: Index cover will be updated !!!
        ok 1 - change 'cover' succeeds
        ok 2 - Exit Code '0' - No Error
        1..2
    ok 1 - mapping change field
    # Subtest: field verification fails
*** ERROR ***: Indices Verification has failed!
        ok 1 - verification fails
        ok 2 - Exit Code '1' - Verification Error
        1..2
    ok 2 - field verification fails
    # Subtest: mapping re-establish field
*** Warning ***: Index cover will be updated !!!
        ok 1 - re-establish 'cover' succeeds
        ok 2 - Exit Code '0' - No Error
        1..2
    ok 3 - mapping re-establish field
    1..3
ok 11 - field mismatch

This gives 100% certainty that the deployed mappings exactly match the definitions in the project, and any changes made by the ElasticSearch engine will become evident.
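
As a rough illustration of how such a field-by-field comparison can work, here is a recursive diff over nested dictionaries in Python. It is a sketch, not the verification code from the pull request: the function name, the "Missing field" message and the assumption that the expected value is printed before the deployed one are all hypothetical. Only the "Mismatch field" message format is taken from the log above.

```python
def diff_mappings(expected, deployed, path=""):
    """Compare two nested mapping dicts and return a list of messages.

    Only keys present in `expected` are checked; extra keys in
    `deployed` are ignored in this sketch.
    """
    messages = []
    for key, want in expected.items():
        field = f"{path}.{key}" if path else key
        if key not in deployed:
            messages.append(f"Missing field: {field}")
        elif isinstance(want, dict) and isinstance(deployed[key], dict):
            # Descend into nested sections such as `properties`.
            messages.extend(diff_mappings(want, deployed[key], field))
        elif want != deployed[key]:
            messages.append(f"Mismatch field: {field} ({want} <> {deployed[key]})")
    return messages


# Values chosen to mirror the mismatch reported in the log above.
expected = {"cover": {"properties": {"version": {"type": "string", "ignore_above": 1024}}}}
deployed = {"cover": {"properties": {"version": {"type": "string", "ignore_above": 2048}}}}

for line in diff_mappings(expected, deployed, path="cover"):
    print(line)
```

An empty result list would correspond to a passing verification, so a predicate in the spirit of mappings_valid() could simply test whether the list is empty.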

@mohawk2
Contributor

mohawk2 commented Jul 31, 2022

A quick look at the PR looks really good. I feel like the check_* (and verify_*) methods would benefit from being renamed since they are predicates and therefore should be named like such, to something like is_mapping_valid (the is_ is optional, perhaps it would cause ugliness for plurals).

@bodo-hugo-barwich
Contributor Author

Following your suggestion, I removed the now unused check_mapping() method and renamed the verification method to mappings_valid().
