We are having a similar issue with source maps scraped by Sentry.
Sentry reports that our source file was not utf8.
However, it is valid utf8.
Uploading the map directly to Sentry with the API works fine. The problem only arises when it is scraped.
Looking at the code that @benvinegar linked to, and loading the file directly into a Python REPL to check its contents, I see that the contents are indeed six.binary_type and can be successfully decoded as utf8, so it's unusual that the file is making its way into that code branch at all.
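A quick sketch of that REPL check (the file contents are simulated inline here; in practice you would read the downloaded source map with `open(path, "rb")`):

```python
# Simulated source-map bytes; substitute the actual downloaded file.
contents = '{"version":3,"sources":["café.js"]}'.encode("utf-8")

# six.binary_type is `bytes` on Python 3 (`str` on Python 2).
assert isinstance(contents, bytes)

# Succeeds for valid utf-8; raises UnicodeDecodeError otherwise.
decoded = contents.decode("utf-8")
print("decoded %d characters" % len(decoded))
```

If `decode("utf-8")` succeeds here, the bytes themselves are valid utf-8 and the error Sentry raises must come from somewhere else in the pipeline.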
Running iconv -f utf-8 against the file is successful.
Is it possible Sentry is incorrectly raising a utf8 error when the problem is something else? Maybe the file isn’t downloaded completely or something?
We also pull this information out of the HTTP headers in the response. Can you share a link to one of these assets? If it’s on sentry.io, we can also help in support and check there.
We did this because I'm pretty sure that without the charset being utf-8, it's handled as ascii. Though I think it might be safe to not be so strict, since ascii is just a safe subset of utf-8.
This problem surfaces when your server sends back a text/* Content-Type without a charset. In that case we were getting an explicit ISO-8859-1 encoding value, whereas we expected either None or utf-8. So the PR explicitly allows this charset, on the grounds that it's compatible with how we decode the content.
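A minimal sketch of that relaxed check, using a hypothetical helper (`charset_ok` is not Sentry's actual function; the whitelist below just mirrors the values discussed above):

```python
from email.message import Message


def charset_ok(content_type_header):
    """Return True if the response charset should be accepted for utf-8 decoding."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    charset = msg.get_param("charset")  # None when the header carries no charset
    # None and utf-8 were accepted before; ascii and ISO-8859-1 reflect the
    # relaxed whitelist described above (hypothetical, for illustration).
    return charset is None or charset.lower() in (
        "utf-8", "us-ascii", "ascii", "iso-8859-1",
    )


print(charset_ok("text/javascript"))                      # no charset given
print(charset_ok("text/javascript; charset=ISO-8859-1"))  # allowed by the PR
print(charset_ok("text/javascript; charset=shift_jis"))   # still rejected
```

The point is only where the decision is made: the charset comes from parsing the Content-Type header, so a server that omits it (or whose framework fills in ISO-8859-1 by default) trips the strict version of this check even when the bytes are perfectly valid utf-8.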