Fingerprinting with whitelisted stacktrace class and line number "path"

Prior to Sentry, we created a hash of a Java throwable's stacktrace class+lineno path, restricted to a set of whitelisted packages. This worked pretty well to group stacktrace reports into collections. I was looking at Raven and planning to subclass SentryAppender to create a fingerprint from any throwable present on the log4j log event. This would then be given to the Sentry event via setFingerprint.

Any easier way to do this? Seems like something others would like to do.

I guess this is a long-winded way of saying whitelisting packages for stacktrace grouping would be swell :slight_smile:

Hey ehthayer, as we discussed on the raven-java issue tracker I plan to add package whitelisting to the SDK soon. That said, it was intended for display purposes in the app (in_app true vs false) but not necessarily fingerprinting – we currently default to letting the Sentry server handle fingerprinting. I need to see how that’s handled, maybe this will help there too.

I implemented a log4j2 filter that adds a fingerprint to the log context, and we have a custom Raven event builder that sets the event's fingerprint from the one in the log context. So far, the team's satisfied with it.

Here’s the fingerprint generator in case it works for someone else:

package com.cargurus.util.errorlogger;

import java.io.UnsupportedEncodingException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

import com.google.common.base.Strings;


/**
 * Create a generatedBy hash for error logging.  Essentially a hash of the stacktrace path through
 * our codebase for an exception or the current log.error() (if no exception is given).
 */
public final class StacktraceUtils {
    private static final String OUR_CLASS_PREFIX = "com.cargurus";

    private StacktraceUtils() {

    }

    /** Create a generated by hash from the exception (if available) or the current thread
     * stacktrace. */
    public static MessageDigest getGeneratedBy(Throwable exception) {
        MessageDigest generatedBy;
        try {
            generatedBy = MessageDigest.getInstance("MD5");
            generatedBy.reset();
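        } catch (NoSuchAlgorithmException e) {
            // Every Java platform must provide MD5, so this should never happen.
            // Fail loudly rather than pass a null digest on to the hashing methods.
            throw new IllegalStateException("MD5 digest unavailable", e);
        }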

        if (exception != null) {
            hashException(generatedBy, exception);
        } else {
            hashStackTrace(generatedBy, Thread.currentThread().getStackTrace());
        }
        return generatedBy;
    }

    private static void hashException(MessageDigest generatedBy, Throwable throwable) {
        if (throwable.getCause() != null) {
            hashException(generatedBy, throwable.getCause());
        }
        hashStackTrace(generatedBy, throwable.getStackTrace());
    }

    /**
     * Create a digest of a sequence of keys created from the stacktrace elements. Each
     * key contains the class and method of the stacktrace element filtered to just our
     * classes */
    private static void hashStackTrace(MessageDigest md, StackTraceElement[] stElements) {
        for (StackTraceElement ste : stElements) {
            if (ste == null) {
                continue;
            }

            if (isOurClass(ste)) {
                StringBuilder key = new StringBuilder();

                key.append(ste.getClassName());
                key.append(".");
                key.append(Strings.nullToEmpty(ste.getMethodName()));

                // deal w/ java compiler decoration of lambdas
                String stableKey = key.toString().replaceAll("\\$[\\d/]+", "\\$");
                try {
                    md.update(stableKey.getBytes("UTF-8"));
                } catch (UnsupportedEncodingException e) {
                    // pretty sure not
                }
            }
        }
    }

    private static boolean isOurClass(StackTraceElement ste) {
        return ste.getClassName().startsWith(OUR_CLASS_PREFIX);
    }
}

Line number turned out to be a poor first choice for the key; method name is more stable across refactorings and reformattings.
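
For anyone adapting this, here is a minimal JDK-only sketch of the same idea that returns a hex string instead of a MessageDigest. The class name FingerprintSketch and the methods stableKey and fingerprint are hypothetical names made up for illustration, and hex-encoding the digest is an assumption about how you might want to turn it into a fingerprint value; none of this is part of the class above or of any Sentry SDK.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper for illustration only. */
public final class FingerprintSketch {
    private FingerprintSketch() {
    }

    /** Builds the per-frame key and collapses compiler-generated numeric suffixes. */
    public static String stableKey(String className, String methodName) {
        String key = className + "." + (methodName == null ? "" : methodName);
        // "$12/345" (lambda class) or "$1" (anonymous class) becomes a bare "$".
        return key.replaceAll("\\$[\\d/]+", "\\$");
    }

    /** Hashes only the frames under packagePrefix and returns the MD5 as lowercase hex. */
    public static String fingerprint(StackTraceElement[] frames, String packagePrefix) {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 digest unavailable", e);
        }
        for (StackTraceElement frame : frames) {
            if (frame != null && frame.getClassName().startsWith(packagePrefix)) {
                String key = stableKey(frame.getClassName(), frame.getMethodName());
                md.update(key.getBytes(StandardCharsets.UTF_8));
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

Calling FingerprintSketch.fingerprint(throwable.getStackTrace(), "com.cargurus") gives a 32-character string that depends only on the whitelisted class and method names, so frames from other packages, line number changes, and renumbered lambdas do not change it. Unlike the class above, this sketch does not walk the cause chain.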