It is rare and intermittent, but there are times when the monitoring portal in Azure Synapse misbehaves and will not show me the details of a completed Spark job. Instead, it displays an error message that says "Fetching Failed". Screenshot:
I have not yet found a pattern or explanation. I reported the problem to CSS support, but they are not yet familiar with the error. I suspect it is a timeout on an internal resource, like a Spark history server or something along those lines.
I realize that some parts of the Synapse platform are proprietary, but it borrows significantly from OSS Spark. Does anyone have an idea what might take so long when retrieving the UI for a completed Livy batch? Is it Azure storage accounts performing badly? Or is it a Spark history server? Is there any reason why they wouldn't wait longer for a response (e.g. ten minutes)? Whenever this happens, the UI seems to fail after a short period of time (only ~60 seconds or so). I haven't found any other patterns. As you can see above, the error message is nothing more than a small tooltip shown in the upper right of the screen; when I shared it with CSS they weren't able to provide any additional guidance or explanation. So I'm hoping there are Synapse users on Stack Overflow who have encountered this.
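To make the question more concrete, here is a rough sketch of fetching the same completed Livy batch directly through the Synapse REST endpoint with a generous timeout. This is only my guess at the kind of call the portal makes under the hood, not its actual code; the workspace name, Spark pool name, and batch id are placeholders.

```python
# Sketch: query the Livy batch endpoint directly to see whether the job
# metadata is reachable even when the monitoring UI shows "Fetching Failed".
# Workspace, Spark pool, and batch id below are placeholders.
import requests
from azure.identity import DefaultAzureCredential

workspace = "myworkspace"    # placeholder Synapse workspace name
spark_pool = "mysparkpool"   # placeholder Spark pool name
batch_id = 42                # placeholder Livy batch id

# Token for the Synapse development endpoint
token = DefaultAzureCredential().get_token(
    "https://dev.azuresynapse.net/.default"
).token

url = (
    f"https://{workspace}.dev.azuresynapse.net"
    f"/livyApi/versions/2019-11-01-preview"
    f"/sparkPools/{spark_pool}/batches/{batch_id}"
)

# Wait far longer than the ~60 seconds the portal appears to allow
resp = requests.get(
    url,
    headers={"Authorization": f"Bearer {token}"},
    params={"detailed": "true"},
    timeout=600,
)
resp.raise_for_status()
batch = resp.json()
print(batch.get("state"), batch.get("result"))
```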
Side note: when things are working properly, the Spark job is presented with the related jobs/stages/tasks/logs like so: