Commit d2271c56 authored by Andrew Newdigate, committed by John Skarbek

Fix Kibana links

parent 637cd0f2
@@ -11,6 +11,7 @@ test:
- /prometheus/promtool check rules alerts/*.yml
- /prometheus/promtool check rules recordings/*.yml
- /prometheus/promtool check rules rules/*.yml
+- scripts/validate_kibana_urls
deploy_elastic_watcher_updates:
stage: deploy
@@ -64,7 +64,6 @@ The aim of this project is to have a quick guide of what to do when an emergency
* [HAProxy is missing workers](troubleshooting/chef.md)
* [Worker's root filesystem is running out of space](troubleshooting/filesystem_alerts.md)
* [Azure Load Balancers Misbehave](troubleshooting/load-balancer-outage.md)
-* [Kibana is down](troubleshooting/kibana_is_down.md)
* [GitLab registry is down](troubleshooting/gitlab-registry.md)
* [Sidekiq stats no longer showing](troubleshooting/sidekiq_stats_no_longer_showing.md)
* [Gemnasium is down](troubleshooting/gemnasium_is_down.md)
@@ -75,7 +75,7 @@ groups:
minutes
description: 'Hey <!subteam^S940BK2TV|cicdops>! This may suggest problems with our autoscaled machines fleet OR
abusive usage of Runners. Check https://dashboards.gitlab.net/dashboard/db/ci
-and https://log.gitlap.com/app/kibana#/dashboard/5d3921f0-79e0-11e7-a8e2-f91bfad41e34'
+and https://log.gitlab.net/app/kibana#/dashboard/5d3921f0-79e0-11e7-a8e2-f91bfad41e34'
- alert: CICDRunnersManagerDown
expr: up{job=~"private-runners|shared-runners|shared-runners-gitlab-org|staging-shared-runners"} == 0
@@ -8,7 +8,7 @@ one build before it is destroyed.
## Do a search through logs
-First, let's log into https://log.gitlap.com/
+First, let's log into https://log.gitlab.net/
To find if the IP was used by Runner:
### How to run commands for ES
-1. ES on `log-esX` is accessible from the logstash node (log.gitlap.com).
+1. ES on `log-esX` is accessible from the logstash node (log.gitlab.net).
1. Run an Elastic API command against any of the ES instances. Since our ES instances are in one cluster, you can run your query against any instance; the result will be the same. See the example below.
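For instance, a minimal health query (a sketch: the `log-es2.gitlap.com` hostname is illustrative, and as noted above, any node in the cluster gives the same answer):

```
curl http://log-es2.gitlap.com:9200/_cluster/health?pretty
```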
### How to check cluster health
@@ -2,7 +2,7 @@
## Kibana URL
-Kibana can be reached on https://log.gitlap.com
+Kibana can be reached on https://log.gitlab.net
Before providing screens/information from Kibana, set/check that your timezone in Kibana is UTC. This makes the information you provide easier for you and other team members to interpret. The timezone can be set in `Settings->Advanced->dateFormat:tz->UTC`.
#!/bin/sh
# Fail CI if any runbook still links to a long-form Kibana URL instead of a shortlink.
ROOT_DIR="$( cd "$( dirname "$0" )/.." && pwd )"
# Long-form Kibana URLs embed dashboard state in `_g=` and `_a=` query parameters.
BAD_URLS=$(find "${ROOT_DIR}" -type f -name '*.md' -print0 | xargs -0 grep -Eo 'log.gitlab.net/app/kibana\S*_g=\S+_a=\S+')
if [ -n "${BAD_URLS}" ]; then
  echo "Please convert your Kibana URLs to shortlinks using the Share | Shortlink menu in Kibana"
  echo "The following incorrect URLs were detected:"
  echo "${BAD_URLS}"
  exit 1
fi
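Shortlinks generated via the Share | Shortlink menu carry neither `_g=` nor `_a=` parameters, so they pass the check. A usage sketch, with an invented file path and offending URL for illustration:

```
$ scripts/validate_kibana_urls
Please convert your Kibana URLs to shortlinks using the Share | Shortlink menu in Kibana
The following incorrect URLs were detected:
/path/to/runbooks/troubleshooting/example.md:log.gitlab.net/app/kibana#/dashboard?_g=(time:(from:now-1h))&_a=(query:'*')
```

The script exits 1 on any match, which is what makes the CI `test` job above fail.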
# SSL Certificate expiring or expired
## First and foremost
*Don't Panic*
## Symptoms
You're seeing alerts like
```
@channel Elasticsearch on log-es3.gitlap.com is down
```
## Possible checks
1. Log in to the corresponding node
1. Check Elasticsearch with `curl http://localhost:9200` (sample output below)
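On a healthy node the root endpoint answers with a small JSON banner; a connection refused or timeout points at the service being down. Illustrative output only (node name and cluster name will differ):

```
$ curl http://localhost:9200
{
  "name" : "log-es3",
  "cluster_name" : "gitlab-logs",
  "tagline" : "You Know, for Search"
}
```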
## Resolution
Restart the Elasticsearch service with the following command on the corresponding node:
```
sudo service elasticsearch restart
```
### Verify that elasticsearch is started
1. Check that the service is started: `sudo service elasticsearch status`
1. Check the HTTP endpoint with `curl http://localhost:9200`
1. Verify that the cluster is operable: `curl http://localhost:9200/_cluster/health?pretty`
1. Verify that there are no `initializing_shards` or `unassigned_shards`. The cluster is not operable while shards are recovering; wait until all shards have moved to the `active` state. You can check this with `curl http://localhost:9200/_cat/shards?v` (see the sketch after this list).
1. Verify that the cluster is in `green` status. Otherwise, start Elasticsearch on all nodes; alerts for the corresponding nodes will fire as well.
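A minimal sketch for watching recovery from the node itself, assuming only the stock `_cluster/health` and `_cat/shards` APIs (`_cat/shards` reports healthy shards as `STARTED`):

```
# Poll until the cluster reports green
while ! curl -s http://localhost:9200/_cluster/health | grep -q '"status":"green"'; do
  sleep 30
done
# List any shards still initializing, relocating, or unassigned
curl -s http://localhost:9200/_cat/shards | grep -v STARTED
```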
## Notes
* There are only `log-es(2|3|4).gitlap.com` nodes.
[ELK performance dashboard]: https://dashboards.gitlab.net/dashboard/db/elk-stats?orgId=1
@@ -15,7 +15,7 @@
## 2. Check the Gitaly Logs
- Check [Sentry](https://sentry.gitlab.net/gitlab/gitaly-production/) for unusual errors
-- Check [Kibana](https://log.gitlap.com/goto/5347dee91b984026567bfa48f30c38fb) for increased error rates
+- Check [Kibana](https://log.gitlab.net/goto/4f0bd7f08b264e7de970bb0cc9530f9d) for increased error rates
- Check the Gitaly service logs on the affected host (see the sketch after this list)
- Check [Grafana dashboards](https://dashboards.gitlab.net/dashboard/db/gitaly-nfs-metrics-per-host?orgId=1) to check for a cause of this outage
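For the host-level log check, a sketch assuming an Omnibus-packaged node where Gitaly logs land under `/var/log/gitlab/gitaly/`; adjust the path for other layouts:

```
# Follow the Gitaly log on the affected host (Omnibus layout assumed)
sudo tail -f /var/log/gitlab/gitaly/current
```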
@@ -18,7 +18,7 @@ only apparent to some users but not others.
curl -v https://pages.gitlab.io > /dev/null
```
-1. Go to https://log.gitlap.com and look for that request. You can search in Kibana for the terms:
+1. Go to https://log.gitlab.net and look for that request. You can search in Kibana for the terms:
```
......
## Reason
The Kibana service is not running.
## Possible checks
1. Open log.gitlap.com and log in; you will see a 502 error
1. SSH to `log.gitlap.com` and run `sudo service kibana status`; the service should be `active (running)`. Otherwise it is down.
## Fix
1. SSH to `log.gitlap.com`.
2. Restart the service with `sudo service kibana restart`.
3. Check the service with `sudo service kibana status` (see the combined check below).
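Put together, a hedged one-liner to confirm the restart took; port 5601 and the `/api/status` endpoint are stock Kibana defaults and may differ behind a proxy:

```
sudo service kibana restart && sleep 10 && curl -s http://localhost:5601/api/status
```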