Diagnosis Steps

Potential Triggers

  • HipChat Server was recently upgraded to HipChat Server v1.3.1 or newer.
  • The server ran low on memory (less than 5% free) and the Elasticsearch process crashed.

Diagnosis

Log into the HipChat Server command line interface and run the following command: 

Code Block
ps aux | grep elastic

If the output does not show an active Java application process, this knowledge base article applies:

 

Code Block: Elasticsearch Not Running
admin    29369  0.0  0.0   8108   940 pts/0    S+   22:55   0:00 grep --color=auto elastic

If the output does show an active Java application process, another problem may be present. Please contact HipChat Server Support for assistance.

 

Code Block: Elasticsearch Running
9002      2159  0.1  6.5 2125668 247796 ?      Sl   Sep11  13:19 /usr/bin/java -server -Djava.net.preferIPv4Stack=true -Des.config=/usr/local/etc/elasticsearchelasticsearch.yml -Xms739m -Xmx739m -Xss256k -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -XX:+UseCondCardMark -XX:+UseTLAB -XX:+CMSClassUnloadingEnabled -XX:+CMSPermGenSweepingEnabled -Delasticsearch -Des.pidfile=/usr/local/var/run/elasticsearch/shanye_impen_com.pid -Des.path.home=/usr/local/elasticsearch -cp :/usr/local/elasticsearch/lib/*:/usr/local/elasticsearch/lib/sigar/* org.elasticsearch.bootstrap.ElasticSearch
admin    29369  0.0  0.0   8108   940 pts/0    S+   22:55   0:00 grep --color=auto elastic
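The check above can also be scripted. The following is a minimal sketch, not part of HipChat Server: the function name is_elasticsearch_running is hypothetical, and it assumes the standard 11-column `ps aux` layout shown in the sample output, where the last column holds the full command line.

```python
def is_elasticsearch_running(ps_output):
    """Return True if any line of `ps aux` output looks like the Elasticsearch JVM.

    Hypothetical helper for illustration; it only inspects text and starts nothing.
    """
    for line in ps_output.splitlines():
        # `ps aux` prints 10 fixed columns followed by the full command line.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        command = fields[10]
        # The grep process itself always matches "elastic", so skip it.
        if command.startswith("grep"):
            continue
        if "java" in command and "elasticsearch" in command.lower():
            return True
    return False
```

To use it against a live system, pass in the decoded output of `ps aux`, for example from subprocess.check_output(["ps", "aux"]).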

 

Cause

The Elasticsearch service has stopped, so chat history can be neither queried nor stored.

Workaround

HipChat Server v1.4.x is missing the following Python dependencies, which are required to use the fabfile.py script mentioned below:

  • fabric
  • redis
  • simplejson
  • elasticsearch
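To see which of these dependencies are absent before running the import, a quick check along the following lines may help. This is a sketch only: missing_modules is a hypothetical helper, and it assumes each package is imported under the same name as listed above.

```python
import importlib

# Module names taken from the list above; the import names are assumed
# to match the package names.
REQUIRED_MODULES = ["fabric", "redis", "simplejson", "elasticsearch"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing
```

Calling missing_modules() with no arguments checks the four modules listed above. An empty list means all of them are importable.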

The fail-over chat history log holds messages (in JSON format) that could not be written to Elasticsearch while it was down. It is located at:

  • /hipchat/tetra-app/es_history_write_failures.log for version 1.4.0 and above.

  • /hipchat/tetra-app/elasticsearch_write_failures.log for versions older than 1.4.0.
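The version rule above can be expressed as a small lookup. This is an illustrative sketch: failover_log_path is a hypothetical helper, and it assumes the version is given as a plain dotted numeric string such as "1.4.0" (suffixes such as "-beta" are not handled).

```python
def failover_log_path(version):
    """Return the fail-over log path for a HipChat Server version string."""
    parts = [int(p) for p in version.split(".")[:3]]
    # Pad short versions such as "1.4" so they compare as 1.4.0.
    while len(parts) < 3:
        parts.append(0)
    if tuple(parts) >= (1, 4, 0):
        return "/hipchat/tetra-app/es_history_write_failures.log"
    return "/hipchat/tetra-app/elasticsearch_write_failures.log"
```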

The fail-over chat history log may be considerable in size, depending on how long Elasticsearch has been unavailable. Use the df -h command to check disk space on the HipChat Server and determine whether the files need to be moved off the local storage volume(s) to prevent the disk from filling up. See HipChat Server disk space is full for more details.
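As a complement to df -h, the log size can be compared with the free space on its volume. The sketch below is an assumption-laden illustration: both function names and the headroom factor are hypothetical, and os.statvfs is available on Unix-like systems only.

```python
import os


def free_bytes(path):
    """Bytes available to unprivileged users on the volume containing `path`."""
    stats = os.statvfs(path)
    return stats.f_bavail * stats.f_frsize


def log_fits_comfortably(log_path, volume_path, headroom=2.0):
    """True if free space on `volume_path` is at least `headroom` times the log size.

    The default factor of 2.0 is an arbitrary illustration, not a documented
    HipChat Server threshold.
    """
    return free_bytes(volume_path) >= headroom * os.path.getsize(log_path)
```

If log_fits_comfortably returns False for the volume that holds the log, moving the file elsewhere before taking a backup copy is probably the safer choice.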

Before restoring messages back into history from the fail-over log, please note the following:

  • The HipChat Server instance will need outbound internet access to install Fabric, a Python library.

  • Snapshot your instance prior to importing messages.

  • Back up the fail-over log file (es_history_write_failures.log or elasticsearch_write_failures.log, depending on your version) to another location in case the import fails.

  • You may want to do this during off-hours if this is a production instance.
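The backup step in the list above could be scripted roughly as follows. This is a sketch under stated assumptions: backup_log is a hypothetical helper, and the timestamped ".bak" naming scheme is an illustration rather than a HipChat Server convention.

```python
import os
import shutil
import time


def backup_log(log_path, backup_dir):
    """Copy `log_path` into `backup_dir` with a timestamp suffix; return the new path."""
    if not os.path.isdir(backup_dir):
        os.makedirs(backup_dir)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = os.path.join(
        backup_dir, "%s.%s.bak" % (os.path.basename(log_path), stamp)
    )
    # copy2 preserves modification times along with the file contents.
    shutil.copy2(log_path, target)
    return target
```

Choose a backup_dir on a different volume from the log itself so that the copy survives if the local storage volume fills up.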

Restoring messages from the fail-over log

The process below outlines the steps to import the message history (complete with timestamps) back into Elasticsearch.

1) Inside the HCS console, elevate privileges 

Code Block
sudo dont-blame-hipchat

2) Upload the attached elastic_import_prereqs.sh to the server under /home/admin and run it:

Code Block
bash elastic_import_prereqs.sh

3) Navigate to the directory where fabfile.py is located:

Code Block
cd /hipchat-scm/tetra/tools/history

4) Verify that Elasticsearch is running

Please see the Diagnosis section above.

If it is not running, please start it by running: 

Code Block
sudo service elasticsearch start

5) Import the fabfile.py script and call the restore function, passing in the path to the ES flat file.

If you're on HipChat Server 1.4.0 or newer, run:


Code Block
python -c "import fabfile; fabfile.import_local_elasticsearch_file('/hipchat/tetra-app/es_history_write_failures.log','es_history')"

Otherwise (versions older than 1.4.0), run:

 

Code Block
python -c "import fabfile; fabfile.import_local_elasticsearch_file('/hipchat/tetra-app/elasticsearch_write_failures.log')"

This will start the import process. The output will look similar to the following:

 

 

Code Block
20% - 1/5 - u'b6f21bb3-8cf8-42e1-8061-cbc933f7dc57'
40% - 2/5 - u'1cc6abbe-cb2f-4801-86d6-836301a97b56'
60% - 3/5 - u'5e3ca9fa-a8ad-4b60-9df8-a494b19fc9ca'
80% - 4/5 - u'52d8c32b-9e0a-4005-bf55-0471bc46a0ef'
100% - 5/5 - u'037c3466-e8bb-4c7e-8260-b97e029508f4'

 

Total import time is determined by how many messages are listed in the fail-over log.
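To get a rough idea of how many messages an import will process, the entries in the fail-over log can be counted beforehand. The sketch below rests on an assumption this article does not confirm: that the log stores one JSON message per line. The function name count_messages is hypothetical.

```python
import json


def count_messages(log_path):
    """Return (valid, malformed) counts of non-empty lines in the fail-over log.

    Assumes one JSON document per line; if the real file uses another layout,
    the counts will not be meaningful.
    """
    valid = 0
    malformed = 0
    with open(log_path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            try:
                json.loads(line)
                valid += 1
            except ValueError:
                # json raises ValueError (or a subclass) on unparseable input.
                malformed += 1
    return valid, malformed
```

A large malformed count suggests that the one-message-per-line assumption does not hold for your file, in which case the valid count should not be used as an estimate.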

If you have any issues, please file a support ticket at support.atlassian.com.