We are posting this later than we would like. Our goal going forward is to post an initial notice within 15 minutes of detecting or confirming an issue.
The issue is now resolved (14:20 PST). It affected a single switch and was unusual in that at no point during the issue (except during the restart and re-registration of phones) did the switch show a markedly decreased call volume.
Our reluctance to disrupt ongoing calls meant the issue took longer to resolve than we would have liked.
Timeline (UTC-8 / PST):
13:25 first 2 client reports indicated an inability to call between extensions
13:40 we had tested and could reproduce the problem. Normally a fault is like a light: either on or off, working or not. In this case, however, simple inbound calls were being processed, and so were simple outbound calls. The issue seemed to relate to some internal functions, including more advanced queues, day/night switches, etc. At this point we could potentially have restarted prior to swapping to backup; however, the large number of active calls indicated the problem might be recoverable without that drastic step, so we started investigating.
13:55 all basic function tests were passing, and we had identified the failing modules / components, but not the cause of the failure
14:05 the issues were determined to be connected to internal network processing on switch 5. This is a very unusual situation, and it is the first time we have seen this sort of behavior. To visualize it, imagine you are in a room full of people: you can talk to others and hear others, but can no longer hear your own voice, as if it were being filtered out.
14:11 first attempt at a non-destructive network restart – network performance degraded further, and some active calls and registered phones were dropped
14:15 second attempt at non-destructive network restart – no improvement.
14:17 complete restart - this process disconnects all phones and calls.
14:20 restart complete - time to test connections and return phone calls from reporting clients.
Thank you to the clients who reported the issue. Anyone who opened a ticket will receive a link to this article.
At this time we have found no cause and no similar examples of an issue like this. If it were to happen again, we would recognize it more quickly, which would shorten its duration.
If there are any questions or comments, please let us know.
BitBlock Systems, Inc.