Martech Monitoring

  • How to Set Up SFMC Automation Error Alerts That Actually Work

    Why Most SFMC Automation Alerts Fail Before They Start

    You’ve set up email notifications in Automation Studio. You feel covered. Then one Monday morning, you discover a nightly data sync has been failing silently for four days, and your alert emails were sitting unread in a shared inbox alongside dozens of other routine notifications nobody checks anymore.

    This is the real problem with SFMC automation alerts: it’s not that the tools aren’t there, it’s that most teams configure them once and assume the job is done. Effective alerting is a system, not a checkbox. This guide walks you through building that system properly, from native SFMC configuration to routing strategies that ensure the right person sees the right error at the right time.

    Understanding What SFMC Actually Gives You Natively

    Automation Studio provides built-in error notification settings at two levels: the account level and the individual automation level. Both matter, and many teams configure only one.

    Account-Level Notification Settings

    In Setup, under Automation Studio Settings, you can define a default notification email address that receives alerts whenever any automation in your account encounters an error. This is a useful catch-all, but it’s also where alert fatigue begins if you’re not careful. Every skipped record warning, every benign timeout retry, every low-severity issue lands in the same inbox as your critical payment data imports.

    Navigate here via: Setup → Platform Tools → Apps → Automation Studio → Settings. The field labeled Error Notifications accepts a single email address or distribution list. Use a distribution list, never a single person’s inbox, so coverage survives vacations and role changes.

    Automation-Level Notifications

    Inside each individual automation, the Notifications tab lets you configure email alerts specific to that workflow. You can set recipients for both errors and skipped records separately. This granularity is powerful and underused. A high-stakes revenue reporting automation should notify your senior data engineer directly. A low-priority preference center sync can notify a shared team alias. Map your notification recipients to the business criticality of the automation, not just who built it.

    The Four Failure Modes You Need to Alert On

    Native SFMC notifications cover activity-level errors, but there are failure patterns that won’t trigger any built-in alert at all. Know all four:

    • Hard activity errors: A SQL query fails, an import file is missing, a script activity throws an exception. These are caught by native notifications and are the most visible failures.
    • Silent skipped records: An import activity processes but skips rows due to validation errors. The automation reports as “complete” and no error notification fires. Your data is silently incomplete.
    • Automation never starts: A schedule drift, a UI save error, or a dependency issue means the automation simply doesn’t run. No error is thrown because nothing executed. This is the ghost failure.
    • Partial completion: Step 1 of 5 completes, Step 2 errors and stops. Downstream activities never run. Native alerts catch the error on Step 2 but won’t tell you what downstream impact occurred.

    For failures in categories 2, 3, and 4, you need monitoring logic beyond what SFMC provides out of the box, which is why teams increasingly rely on external tools like Martech Monitoring to watch for automations that don’t run on schedule, not just automations that error when they do.

    Building an Alert Routing Strategy That Scales

    The goal is simple: the right person gets paged for a P1 failure, and nobody gets paged at 2am for a warning-level skipped record report. Here’s how to structure it.

    Tier Your Automations by Business Impact

    Before touching any notification settings, classify every automation in your instance into three tiers:

    • Tier 1 – Critical: Revenue-impacting, compliance-related, or feeds downstream systems (e.g., transactional sends, CRM syncs, suppression list imports). Failure requires immediate response.
    • Tier 2 – Important: Operational but recoverable within a business day (e.g., lead nurture programs, daily reporting). Failure should surface within hours.
    • Tier 3 – Low Priority: Nice-to-have automations where failure has minimal immediate business impact (e.g., weekly digests, preference data aggregation).

    Document this classification in a shared spreadsheet or your team’s wiki. It becomes the foundation for every alerting decision you make.

    Route Alerts by Tier, Not by Sender

    Once tiers are defined, configure notification recipients accordingly:

    • Tier 1 automations: Alert a distribution list that triggers a PagerDuty or Opsgenie incident, or at minimum routes to a Slack channel that has an on-call rotation. If your team doesn’t have an on-call process for marketing data, this is the moment to build one.
    • Tier 2 automations: Alert a team email alias that someone reviews every morning. Consider a dedicated sfmc-automation-alerts@yourcompany.com address that feeds into a monitored ticketing queue.
    • Tier 3 automations: Log the error but don’t alert urgently. A weekly digest review of Tier 3 failures is often sufficient.
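
    One way to make the tier-to-channel mapping explicit is a small lookup table that alert-handling code can consult. The sketch below is illustrative only: the channel names, the ROUTES table, and the routeAlert function are assumptions made for this example, not part of SFMC or any paging tool's API.

```javascript
// Hypothetical routing table: maps an automation tier to where its alerts go.
const ROUTES = {
  1: { channel: "pager", urgent: true },          // e.g. a PagerDuty or Opsgenie incident
  2: { channel: "ticket-queue", urgent: false },  // reviewed every morning
  3: { channel: "weekly-digest", urgent: false }  // logged, reviewed weekly
};

// Look up the route for an automation. Unknown or missing tiers fall back to
// Tier 2 so an unclassified automation is never silently dropped.
function routeAlert(automation) {
  const route = ROUTES[automation.tier] || ROUTES[2];
  return { automation: automation.name, ...route };
}

const example = routeAlert({ name: "Nightly Revenue Sync", tier: 1 });
console.log(example);
```

    Falling back to Tier 2 for unclassified automations is a design choice: it favors a visible, non-urgent alert over either silence or an unnecessary page.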

    Defeating Alert Fatigue: The Practical Approach

    Alert fatigue is the silent killer of monitoring programs. When every notification looks the same, regardless of severity, humans learn to ignore them all. Here are specific tactics to prevent this in SFMC environments.

    Suppress Noise at the Source

    Audit your Automation Studio error logs for the last 30 days. Identify recurring errors that your team has already assessed as non-actionable. Common culprits include:

    • FTP import automations that error on weekends when source files aren’t generated (expected behavior, not a real failure)
    • SQL queries that return zero rows and are configured to error on empty results unnecessarily
    • Script activities with overly broad try/catch blocks that escalate warnings as errors

    Fix these at the automation level first. Change SQL activities to handle empty results gracefully. Adjust schedule windows to match when source data is actually available. Every non-actionable alert you eliminate is one fewer cry-wolf notification eroding your team’s trust in the system.

    Use Meaningful Subject Lines

    SFMC’s native notification emails have generic subject lines. When these arrive in a shared inbox, no one knows at a glance whether to escalate or ignore. If you’re routing alerts through a middleware tool or webhook (see below), customize the subject line to include:

    • Automation name
    • Failure tier (e.g., [CRITICAL] or [LOW])
    • Error type in plain language

    Example: [CRITICAL] Revenue Data Import – Import Activity Failed – Missing Source File tells the recipient everything they need to triage before opening the email.
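
    If alerts pass through middleware you control, the subject line can be assembled from the three fields listed above. This is a minimal sketch; the buildSubject function, its field names, and the tier labels are hypothetical and would need to match whatever your middleware actually receives.

```javascript
// Illustrative tier labels; use whatever vocabulary your team has agreed on.
const TIER_LABELS = { 1: "CRITICAL", 2: "IMPORTANT", 3: "LOW" };

// Build a triage-friendly subject line: [TIER] automation name - error summary.
function buildSubject({ automationName, tier, errorSummary }) {
  const label = TIER_LABELS[tier] || "UNCLASSIFIED";
  return `[${label}] ${automationName} - ${errorSummary}`;
}

console.log(buildSubject({
  automationName: "Revenue Data Import",
  tier: 1,
  errorSummary: "Import Activity Failed - Missing Source File"
}));
```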

    Extending Alerts Beyond Native SFMC: The API Approach

    For teams that need richer alerting logic, the SFMC REST API opens up significant options. You can use a Script Activity at the end of each automation to make an API call that logs completion status to an external system or triggers a conditional alert.

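    // Script Activity - Automation Heartbeat to External Webhook
    // In a Script Activity, wrap this code in <script runat="server"> ... </script>.
    Platform.Load("Core", "1"); // load the Core library before using utility functions

    var endpoint = 'https://your-monitoring-endpoint.com/sfmc/heartbeat';
    var payload = {
      automationName: 'Nightly Revenue Sync',
      status: 'complete',
      // Current system time converted to the account's local time
      timestamp: Platform.Function.SystemDateToLocalDate(Now()),
      environment: 'Production'
    };

    var req = new Script.Util.HttpRequest(endpoint);
    req.emptyContentHandling = 0;
    req.retries = 2;             // the HttpRequest property is `retries`, not `retryCount`
    req.continueOnError = true;  // a failed heartbeat should not fail the automation itself
    req.encoding = 'UTF-8';
    req.method = 'POST';
    req.contentType = 'application/json';
    req.postData = Stringify(payload);

    // Send the heartbeat; the external monitor alerts when none arrives in time
    var resp = req.send();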
    

    Place this Script Activity as the final step in your Tier 1 automations. If the webhook doesn’t receive a heartbeat within the expected window, your external monitoring layer fires an alert. This catches the ghost failure scenario (automations that never start), which SFMC’s native tools cannot detect on their own. Because the heartbeat is the last step, a run that stops partway through also sends nothing and is flagged the same way.
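
    On the receiving side, the check is a comparison between when each automation last reported and how long a gap you tolerate. The sketch below assumes the webhook stores the last heartbeat time per automation; the findMissedHeartbeats function, the data shapes, and the 24.5-hour tolerance are all hypothetical.

```javascript
// Given the tolerated gap per automation and the time each one last reported,
// return the names of automations whose heartbeat is overdue.
function findMissedHeartbeats(expected, lastSeen, now) {
  return expected
    .filter(({ name, maxGapMinutes }) => {
      const seen = lastSeen[name];
      if (seen === undefined) return true; // never reported at all
      const gapMinutes = (now - seen) / 60000;
      return gapMinutes > maxGapMinutes;
    })
    .map(({ name }) => name);
}

const now = Date.parse("2026-01-15T06:00:00Z");
const expected = [
  { name: "Nightly Revenue Sync", maxGapMinutes: 24 * 60 + 30 },
  { name: "Suppression List Import", maxGapMinutes: 24 * 60 + 30 }
];
const lastSeen = {
  "Nightly Revenue Sync": Date.parse("2026-01-15T02:05:00Z"),
  "Suppression List Import": Date.parse("2026-01-13T02:00:00Z")
};
console.log(findMissedHeartbeats(expected, lastSeen, now));
```

    Treating an automation with no recorded heartbeat as missed means a newly registered automation alerts until it reports for the first time, which is usually the safer default.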

    Platforms like Martech Monitoring are purpose-built for this pattern, monitoring automation run schedules and surfacing missed executions automatically without requiring you to build and maintain custom webhook infrastructure.

    Operationalizing Your Alert System: What Good Looks Like

    A mature SFMC alerting setup has these characteristics:

    • Every Tier 1 automation has a documented expected run window: not just an error alert, but a “this should have run by X time” check.
    • Alert recipients are role-based distribution lists, not individual email addresses. When someone leaves, the alert coverage doesn’t leave with them.
    • There’s a monthly alert audit where the team reviews which alerts fired, which were acted on, and which were noise. Anything generating recurring noise gets investigated and fixed.
    • Runbooks exist for Tier 1 failures. When an alert fires at 11pm, the on-call person shouldn’t have to guess what to do. A short runbook per automation (what the failure likely means, what to check first, who to escalate to) dramatically reduces mean time to resolution.
    • Alerts are tested deliberately. At least once a quarter, intentionally break a Tier 1 automation in a sandboxed way to verify the full alert chain fires correctly and reaches the right people.

    Conclusion

    Effective SFMC automation alerting is less about enabling a notification email and more about building a system your team actually trusts and responds to. That means tiering your automations, routing alerts with purpose, eliminating noise at the source, and monitoring for failures that SFMC’s native tools simply can’t see, like automations that never run.

    The teams that get this right catch failures before they impact customer sends or downstream data quality. The teams that don’t are still discovering four-day-old failures on Monday mornings.

    Want to automate your SFMC monitoring without building custom infrastructure? Check out Martech Monitoring, built specifically to give SFMC teams visibility into automation health, missed runs, and deliverability issues before they become business problems.

  • How to Monitor Salesforce Marketing Cloud: The Complete 2026 Guide

    If you’re responsible for a Salesforce Marketing Cloud instance, you already know that keeping it running smoothly requires more than just building campaigns and pressing “send.” Knowing how to monitor Salesforce Marketing Cloud effectively is what separates teams that react to problems from teams that prevent them. In this guide, we’ll cover everything you need to monitor in SFMC, from automations and journeys to deliverability and data flows, along with practical approaches for building a monitoring strategy that actually works in 2026.

    Why SFMC Monitoring Matters More Than Ever

    Marketing Cloud has grown significantly in complexity over the past few years. Most enterprise SFMC instances now include dozens of active automations, multiple Journey Builder campaigns, complex data extension architectures, API integrations with external systems, and cross-cloud connections to Sales Cloud, Service Cloud, and Data Cloud. Each of these components can fail independently, and a single failure can cascade across your entire marketing operation.

    The challenge is that SFMC’s built-in monitoring tools haven’t kept pace with this complexity. You get basic run history in Automation Studio, some contact-level journey analytics, and email tracking reports, but there’s no unified dashboard that tells you “everything is healthy” or “here’s what needs your attention right now.” Building that visibility is on you.

    What to Monitor in Salesforce Marketing Cloud

    1. Automation Studio Health

    Automations are the backbone of most SFMC implementations. They handle data imports, SQL transformations, file transfers, email sends, and more. Here’s what you should be tracking:

    • Run status: Did each scheduled automation complete successfully? Track successes, failures, and skipped runs.
    • Run duration: How long is each automation taking? A gradual increase in run time is an early warning sign that queries are hitting data volume limits or that steps need optimization.
    • Step-level errors: Which specific activity within an automation failed? A SQL query timeout is a very different problem from an SFTP file-not-found error, and they require different fixes.
    • Schedule adherence: Are automations starting and finishing within their expected windows? Overlapping runs and schedule drift can cause data integrity issues downstream.
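
    The run-duration signal above can be checked with very little logic once durations are being logged. The following is a rough sketch, assuming you already record each run's duration in minutes; the isDurationAnomalous function and its 1.5x factor are illustrative starting points, not recommended thresholds.

```javascript
// Flag a run whose duration is well above the average of recent runs.
// `history` holds prior run durations in minutes.
function isDurationAnomalous(history, latest, factor = 1.5) {
  if (history.length === 0) return false; // no baseline yet, nothing to compare
  const mean = history.reduce((sum, d) => sum + d, 0) / history.length;
  return latest > mean * factor;
}

console.log(isDurationAnomalous([10, 12, 11], 20));
console.log(isDurationAnomalous([10, 12, 11], 14));
```

    A simple mean is easy to reason about but is skewed by one-off slow runs; a median or a rolling window may suit noisier automations better.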

    2. Journey Builder Performance

    Journeys are harder to monitor than automations because they process contacts individually over time, rather than executing as a single batch. Key metrics to watch:

    • Entry rate: How many contacts are entering each journey per evaluation period? A sudden drop to zero usually means the entry source (API event, DE, or Salesforce data event) is broken.
    • Error rate by activity: Which journey steps are producing errors, and at what rate? A 2% error rate on an email send might be acceptable; a 40% error rate on a decision split means something is fundamentally wrong.
    • Contact throughput: Are contacts moving through the journey at the expected pace, or are they getting stuck at wait steps or bottlenecked at high-volume activities?
    • Goal and exit metrics: Are contacts reaching the journey’s goal at the expected conversion rate? A steep drop in goal attainment may indicate a problem further upstream in the journey logic.
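
    The entry-rate and error-rate checks above translate into a small health function once the counts are available. This sketch assumes you have already collected per-activity counts by some means; the data shape, the checkJourneyHealth function, and the ceilings in MAX_ERROR_RATE are hypothetical.

```javascript
// Hypothetical error-rate ceilings per activity type; tune to your own baselines.
const MAX_ERROR_RATE = { email: 0.05, decisionSplit: 0.01 };

// Return a list of human-readable problems for one journey's recent stats.
function checkJourneyHealth(stats) {
  const problems = [];
  if (stats.entered === 0) problems.push("no contacts entered this period");
  for (const activity of stats.activities) {
    if (activity.processed === 0) continue; // nothing to measure yet
    const rate = activity.errors / activity.processed;
    const limit = MAX_ERROR_RATE[activity.type];
    if (limit !== undefined && rate > limit) {
      problems.push(`${activity.name}: ${(rate * 100).toFixed(1)}% errors`);
    }
  }
  return problems;
}

console.log(checkJourneyHealth({
  entered: 1200,
  activities: [
    { name: "Welcome Email", type: "email", processed: 1000, errors: 20 },
    { name: "Is Customer?", type: "decisionSplit", processed: 1000, errors: 400 }
  ]
}));
```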

    3. Email Deliverability and Send Health

    Email is still the primary channel for most SFMC users, and deliverability monitoring is critical to maintaining sender reputation and inbox placement:

    • Bounce rates: Track hard and soft bounce rates per send and over time. A spike in hard bounces may indicate a list hygiene issue or a problem with a data source feeding bad addresses into your audience.
    • Complaint rates: Monitor spam complaint rates closely. ISPs like Gmail and Yahoo use complaint rates as a primary factor in filtering decisions. Staying below 0.1% is the widely accepted threshold.
    • Send volumes and throughput: Are your sends completing in a reasonable timeframe? Unusually slow send throughput can indicate platform-level throttling or deliverability issues.
    • Engagement metrics: Open rates and click rates aren’t just marketing KPIs; they’re also deliverability signals. A sustained decline in engagement across multiple campaigns may mean your messages are landing in spam folders.
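
    The bounce and complaint checks above reduce to a couple of ratios per send. In this sketch the 0.1% complaint ceiling comes from the text, while the 2% hard-bounce ceiling, the sendHealth function, and the field names are assumptions chosen for illustration.

```javascript
// Compute per-send rates and flag any that exceed a limit.
function sendHealth(send, limits = { complaintRate: 0.001, hardBounceRate: 0.02 }) {
  // Complaints are measured against delivered mail, bounces against attempted sends.
  const complaintRate = send.delivered > 0 ? send.complaints / send.delivered : 0;
  const hardBounceRate = send.sent > 0 ? send.hardBounces / send.sent : 0;
  const flags = [];
  if (complaintRate > limits.complaintRate) flags.push("complaints");
  if (hardBounceRate > limits.hardBounceRate) flags.push("hardBounces");
  return { complaintRate, hardBounceRate, flags };
}

console.log(sendHealth({ sent: 100000, delivered: 98000, hardBounces: 500, complaints: 150 }));
```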

    4. Data Extension and Data Flow Integrity

    Bad data causes bad marketing. Monitor the health of your data layer:

    • Row counts: Track the row count of critical data extensions over time. A DE that should have 500,000 subscribers but suddenly shows 12 rows means an import went wrong.
    • Freshness: When was each key data extension last updated? If a DE that should refresh daily hasn’t been touched in 72 hours, something in the pipeline is broken.
    • Schema stability: Are critical data extension schemas being modified unexpectedly? Unplanned schema changes are one of the top causes of automation and query failures.
    • Import success rates: Track the success and error rates of file imports. Partial imports (where some rows succeed and others fail) can be especially dangerous because they don’t trigger a full failure alert but still result in incomplete data.
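
    Row-count and freshness checks can share one small function once you have a data extension's current row count and last-updated time. This is a sketch under assumed inputs: the checkDataExtension function, its field names, and the minRowRatio idea are hypothetical, and how you obtain the counts is left open.

```javascript
const HOUR_MS = 60 * 60 * 1000;

// Compare a data extension's current state against its documented baseline.
function checkDataExtension(de, nowMs) {
  const issues = [];
  const ageHours = (nowMs - de.lastUpdatedMs) / HOUR_MS;
  if (ageHours > de.maxAgeHours) {
    issues.push(`stale: last updated ${Math.floor(ageHours)}h ago`);
  }
  if (de.rowCount < de.expectedRows * de.minRowRatio) {
    issues.push(`row count ${de.rowCount} far below expected ${de.expectedRows}`);
  }
  return issues;
}

console.log(checkDataExtension({
  rowCount: 12,
  expectedRows: 500000,
  minRowRatio: 0.8,   // alert if the DE shrinks below 80% of its baseline
  maxAgeHours: 24,    // this DE should refresh daily
  lastUpdatedMs: Date.parse("2026-01-12T09:00:00Z")
}, Date.parse("2026-01-15T09:00:00Z")));
```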

    5. API and Integration Health

    Modern SFMC implementations rely heavily on APIs, both the SFMC REST/SOAP APIs and integrations with external systems:

    • API call volumes and error rates: Are your integrations making the expected number of API calls? Are any calls returning 4xx or 5xx errors?
    • Rate limit consumption: SFMC enforces API rate limits per business unit. If you’re approaching the limit, automated processes may start failing intermittently.
    • Marketing Cloud Connect sync status: If you’re using MC Connect to synchronize data from Sales or Service Cloud, monitor the sync frequency and watch for synchronization errors or stale data.

    6. User Activity and Governance

    In larger organizations, monitoring what users are doing inside SFMC is just as important as monitoring what the platform is doing:

    • Audit trail events: Who created, modified, or deleted automations, journeys, data extensions, or content? Tracking changes helps you correlate failures with specific modifications.
    • Permission changes: Were any user roles or business unit permissions changed? Unintended permission changes can break cross-BU automations and shared data access.

    Approaches to SFMC Monitoring

    The Manual Approach (And Why It Doesn’t Scale)

    Many teams start with a manual routine: log into Automation Studio each morning, scan for red icons, check a few key journeys, review yesterday’s send reports. This works when you have a handful of automations and one or two active journeys. It falls apart completely when you have 50+ automations, a dozen journeys, and sends happening around the clock across multiple business units.

    Manual checks also miss issues that happen between logins. If an automation fails at 2 AM and you don’t check until 9 AM, that’s seven hours of downstream impact: missed sends, stale data, and broken customer experiences.

    The DIY Approach: Custom-Built Monitoring

    Some teams build their own monitoring layer using SFMC’s APIs. This typically involves:

    • Scheduled scripts that poll the Automation API for run statuses
    • Custom data extensions that log results over time
    • Alert emails triggered by error conditions
    • Dashboards built in a BI tool (Tableau, Datorama/Intelligence, or similar) that visualize trends

    This approach gives you full control, but it comes with significant costs: development time, ongoing maintenance, and the risk that your monitoring infrastructure itself becomes another thing that can break. Teams that go this route typically invest 40-80+ hours of initial development and several hours per month in maintenance.

    The Purpose-Built Approach: Dedicated SFMC Monitoring Tools

    The most efficient approach for most teams is to use a monitoring platform designed specifically for Salesforce Marketing Cloud. A tool like Martech Monitoring connects to your SFMC instance and provides out-of-the-box visibility into automations, journeys, sends, and data flows, with real-time alerting that notifies you via email, Slack, or other channels the moment something goes wrong.

    The advantage of a purpose-built tool is that it understands SFMC’s specific failure modes and surfaces the right information without requiring you to build and maintain custom infrastructure. You get monitoring coverage from day one instead of spending weeks building scripts.

    Building Your SFMC Monitoring Strategy

    Regardless of which approach you choose, a solid monitoring strategy should include these elements:

    • Define what “healthy” looks like. For each automation, journey, and data flow, document the expected behavior: how often it should run, how many contacts it should process, and what a normal run duration looks like. You can’t detect anomalies without a baseline.
    • Classify by criticality. Not every automation is equally important. Identify your Tier 1 processes (revenue-impacting, customer-facing) and ensure they have the most aggressive monitoring and fastest alert response times.
    • Set up layered alerting. Use different alert channels for different severity levels. A non-critical automation failure might generate a Slack message; a failed journey that’s impacting thousands of customers should trigger an SMS or phone call.
    • Establish an incident response process. When an alert fires, who investigates? What’s the escalation path? How do you communicate impact to stakeholders? Having a documented process prevents chaos during high-pressure failures.
    • Review and refine monthly. Your SFMC instance is constantly evolving โ€” new automations, new journeys, new integrations. Revisit your monitoring coverage monthly to ensure new processes are covered and retired processes are removed.

    Get Started With SFMC Monitoring Today

    Monitoring Salesforce Marketing Cloud isn’t optional; it’s foundational. Every campaign, every customer touchpoint, and every data pipeline depends on your SFMC instance running correctly. The question isn’t whether you need monitoring; it’s how much visibility you have right now and whether it’s enough to catch problems before they become emergencies. If you’re ready to move beyond manual spot-checks and get comprehensive, real-time visibility into your Marketing Cloud environment, start your free Martech Monitoring account and see exactly what’s happening in your SFMC instance, right now.


    Take Action on Your SFMC Monitoring

    Download the free SFMC Monitoring Checklist: 27 critical items to monitor, with recommended frequencies and alert thresholds for each.

    Or watch the product demo to see how Martech Monitoring automates all of this for you, catching Journey failures, Automation errors, and Data Extension issues in minutes, not days.

    Start monitoring free, no credit card required.

  • Understanding Marketing Cloud Journey Errors: Causes, Diagnosis, and Prevention

    Journey Builder is one of the most powerful tools in Salesforce Marketing Cloud, but it’s also one of the most complex, and when things go wrong, Marketing Cloud journey errors can be difficult to untangle. A journey that quietly stops processing contacts, throws cryptic error codes, or delivers messages to the wrong audience can cause real damage to your customer experience and your team’s confidence in the platform. This guide will help you understand why journey errors happen, how to diagnose them efficiently, and what you can do to prevent them going forward.

    How Journey Builder Processes Contacts (And Where It Breaks)

    Before diving into specific errors, it helps to understand Journey Builder’s processing model. When a contact enters a journey, they move through a series of activities, decision splits, and wait steps on a per-contact basis. Each step is evaluated and executed asynchronously by SFMC’s backend. This means errors don’t always surface immediately: a contact might enter a journey successfully but fail at step five, three days later, with no visible alert in the UI unless you go looking for it.

    This delayed-failure pattern is what makes journey errors so insidious. Unlike an automation that fails loudly at the scheduled run time, a journey can be “running” with a green status while silently dropping contacts at various stages.

    The Most Common Journey Builder Errors

    1. Contact Entry Source Failures

    The journey can’t process contacts that never enter it. Entry source errors are among the most common issues and typically stem from:

    • API event misconfiguration: The API event’s data extension schema doesn’t match the event definition, or the firing API call is sending malformed payloads.
    • Data extension entry source issues: The DE used as the entry source has no new records, the automation that populates it failed, or the contact key field doesn’t match Subscriber Key in All Subscribers.
    • Salesforce data entry events: The Marketing Cloud Connect integration is broken, the synchronized data source hasn’t refreshed, or field mappings have drifted after a Salesforce org change.

    Diagnosis tip: Check the journey’s Entry Source health by navigating to the journey canvas and clicking on the entry event. The “History” tab will show you how many contacts entered (or attempted to enter) over recent evaluation periods. If the number is zero when it shouldn’t be, the problem is upstream of the journey itself.

    2. Email Activity Errors

    Email send failures within journeys typically produce error codes that fall into a few categories:

    • Content errors: Personalization strings (AMPscript or dynamic content) that reference missing data extension fields, divide by zero, or produce null values. A single AMPscript error can prevent the email from rendering for that contact.
    • Subscriber status issues: The contact is unsubscribed, held, or bounced in All Subscribers. Journey Builder will skip the send but may not clearly flag this as an “error”; the contact simply exits or gets stuck.
    • Send classification problems: An invalid sender profile, missing reply-to address, or deactivated delivery profile will cause the entire email activity to fail for all contacts passing through it.

    Diagnosis tip: Use Journey Builder Analytics to identify which email activity has a high error or skip rate. Then examine individual contact records by searching for a specific Subscriber Key in the journey’s contact history to see exactly which step failed and the associated error message.

    3. Decision Split and Engagement Split Errors

    Decision splits evaluate contacts against criteria (data extension values, contact attributes, or engagement data). Errors here typically arise from:

    • Null values in evaluated fields: If the decision split checks a field that’s NULL for a contact, the behavior depends on how the criteria was written. Contacts may unexpectedly fall through to the “No” path or get stuck entirely.
    • Stale data references: If the decision split references a data extension that has been deleted, renamed, or had its schema changed, the split can throw an evaluation error.
    • Engagement split timing: Engagement splits (opened email, clicked link) have a configurable wait period. If the wait period is too short, most contacts will appear as “not engaged” simply because they haven’t had time to interact yet.

    4. Wait Step and Timing Issues

    Wait steps seem simple, but they’re a frequent source of unexpected journey behavior:

    • Wait “until date” referencing a past date: If the contact attribute or DE field used for the wait-until date contains a date that has already passed, the contact may be released immediately or get stuck indefinitely, depending on the journey version and configuration.
    • Time zone mismatches: Journeys process in the account’s default time zone unless explicitly configured otherwise. If your wait step says “wait until 9 AM” but your audience spans multiple time zones, contacts may receive messages at unexpected local times.
    • Journey processing delays: During high-volume periods, SFMC’s journey processing queue can experience delays. Contacts may not advance through wait steps at precisely the expected time, leading to bunched sends.
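
    The time zone mismatch above is easy to see by formatting a single send instant in several subscriber time zones. This sketch uses the standard Intl.DateTimeFormat API; the localTimes helper and the chosen zones are illustrative, and it assumes a runtime with full time zone data, as current Node.js releases ship by default.

```javascript
// Format one instant as local wall-clock time (24-hour HH:MM) in each zone.
function localTimes(instant, timeZones) {
  const result = {};
  for (const tz of timeZones) {
    result[tz] = new Intl.DateTimeFormat("en-US", {
      timeZone: tz,
      hour: "2-digit",
      minute: "2-digit",
      hourCycle: "h23"
    }).format(instant);
  }
  return result;
}

// 14:00 UTC on 15 Jan 2026 is 9:00 AM in New York (UTC-5 in winter).
const sendInstant = new Date(Date.UTC(2026, 0, 15, 14, 0));
console.log(localTimes(sendInstant, [
  "America/New_York",
  "America/Los_Angeles",
  "Europe/London"
]));
```

    A “9 AM” wait step evaluated in an East Coast account time zone therefore reaches West Coast subscribers at 6 AM local time, which is the kind of surprise the bullet above describes.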

    5. Update Contact and Custom Activity Errors

    Update Contact activities write data back to data extensions or contact attributes. These can fail when:

    • The target data extension or attribute group has been modified or deleted.
    • The value being written violates a data type constraint (e.g., writing text to a numeric field).
    • Custom activities that call external endpoints encounter HTTP errors, timeouts, or authentication failures.

    A Systematic Approach to Diagnosing Journey Errors

    When you suspect a journey is misbehaving, follow this diagnostic framework:

    1. Check the journey version status. Is it Running, Stopped, or in Draft? If someone accidentally stopped the journey or created a new version without activating it, contacts won’t be processing.
    2. Review the entry source health. Confirm that contacts are actually entering the journey at the expected rate. A zero-entry count points to an upstream data or integration problem.
    3. Examine the journey’s error count. On the journey canvas, each activity displays a count of contacts that errored at that step. Click through to identify the specific error messages.
    4. Trace individual contacts. Use the Contact Lookup feature to search for a specific subscriber key and follow their path through the journey. This will show you exactly where they are, where they stalled, and any error codes associated with their record.
    5. Check related automations and data flows. Journeys rarely operate in isolation. If the journey depends on an automation to populate its entry DE, or on a synchronized data source from Sales Cloud, verify that those upstream processes are running correctly.

    Preventing Journey Errors Proactively

    The most effective SFMC teams treat journey reliability as an ongoing discipline, not a one-time setup task. Here’s what that looks like in practice:

    • Validate entry data before it reaches the journey. Use a pre-processing automation with SQL queries to filter out contacts with missing or invalid data before they enter a journey. It’s far easier to catch bad data upstream than to debug why individual contacts are erroring inside a complex multi-step journey.
    • Test journeys with a controlled audience first. Before activating a journey at full scale, run it with a small test data extension containing known test records. Verify that each path, decision split, and activity works as expected with real (not hypothetical) data.
    • Monitor journey health continuously. Don’t assume a running journey is a healthy journey. Tools like Martech Monitoring can track journey error rates, entry counts, and activity failures in real time, alerting you the moment something deviates from expected behavior, so you can intervene before thousands of contacts are affected.
    • Document your journey architecture. For complex, multi-branch journeys, maintain a plain-language document that explains the intended logic, the data extensions involved, the expected entry volume, and the dependencies on external systems. When something breaks six months from now, this documentation will save your team hours of reverse-engineering.
    • Review and prune regularly. Inactive or outdated journey versions consume system resources and create confusion. Set a quarterly cadence to review all active journeys, stop any that are no longer needed, and consolidate duplicates.
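
    The first practice above, validating entry data before it reaches the journey, comes down to a set of row-level rules. Inside SFMC those rules would typically live in a SQL Query activity; the sketch below expresses the same idea in JavaScript so the logic is easy to read. The field names, the partitionContacts function, and the deliberately loose email pattern are assumptions for illustration.

```javascript
// Loose shape check only: something@something.something, no spaces.
const EMAIL_PATTERN = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

// Split rows into those safe to inject into a journey and those held back,
// recording why each rejected row failed.
function partitionContacts(rows) {
  const valid = [];
  const rejected = [];
  for (const row of rows) {
    const reasons = [];
    if (!row.SubscriberKey) reasons.push("missing SubscriberKey");
    if (!EMAIL_PATTERN.test(row.EmailAddress || "")) reasons.push("invalid EmailAddress");
    if (reasons.length === 0) valid.push(row);
    else rejected.push({ row, reasons });
  }
  return { valid, rejected };
}

console.log(partitionContacts([
  { SubscriberKey: "001", EmailAddress: "ana@example.com" },
  { SubscriberKey: "", EmailAddress: "bo@example.com" },
  { SubscriberKey: "003", EmailAddress: "not-an-email" }
]));
```

    Keeping the rejected rows and their reasons, rather than discarding them, gives you a record to review when entry volume drops unexpectedly.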

    Common Journey Error Codes and What They Mean

    Here are some of the error codes you may encounter in Journey Builder’s contact history and what they indicate:

    • Error 0: General processing error, often a transient platform issue. If it persists, contact Salesforce Support.
    • Error 6: Contact was suppressed due to subscriber status (unsubscribed, held, or bounced).
    • Error 12: Email content rendering failure, typically caused by an AMPscript error.
    • Error 18: Data extension or attribute lookup failure; the referenced data source may have been modified or removed.
    • Error 24: External activity (REST or custom) returned a non-success HTTP status code.

    Keep Your Journeys Running Smoothly

    Journey Builder errors are a fact of life in complex Marketing Cloud implementations, but they don’t have to be emergencies. With systematic diagnosis, proactive validation, and continuous monitoring, you can catch and resolve issues before they impact your customers. If you want to take the guesswork out of journey monitoring, sign up for Martech Monitoring and get visibility into every journey, automation, and data flow running in your SFMC account, without building a single custom report.


    Take Action on Your SFMC Monitoring

    Download the free SFMC Monitoring Checklist: 27 critical items to monitor, with recommended frequencies and alert thresholds for each.

    Or watch the product demo to see how Martech Monitoring automates all of this for you, catching Journey failures, Automation errors, and Data Extension issues in minutes, not days.

    Start monitoring free. No credit card required.

  • Why Your SFMC Automation Stopped Working (And How to Fix It Fast)

    Few things derail a marketing team’s day quite like discovering that an SFMC automation stopped working overnight. Emails didn’t send, data extensions weren’t updated, and your carefully orchestrated campaign is sitting idle. If you’re staring at a paused or errored automation in Salesforce Marketing Cloud right now, you’re not alone: this is one of the most common (and most frustrating) issues SFMC administrators face. The good news: most automation failures have predictable causes and straightforward fixes.

    In this post, we’ll walk through the most frequent reasons automations break in Marketing Cloud, how to diagnose the root cause quickly, and what you can do to prevent these failures from happening again.

    The Most Common Reasons SFMC Automations Fail

    1. Data Extension Schema Changes

    This is the number-one culprit behind automation failures. If someone modifies a data extension that your automation depends on (adding a column, removing a field, or changing a data type), the automation’s SQL query or import activity can break silently. SFMC won’t always warn you in advance; it simply fails at runtime.

    What to check: Open the Activity tab of your automation and look at which step errored. If it’s an SQL Query or Import File activity, compare the target data extension’s current schema against what your query or file expects. Even a single renamed column can cause a complete failure.
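
    The schema comparison described above can be sketched in a few lines. This is a minimal example, assuming you have already written down the columns your query or file expects and copied the data extension’s current columns by hand or from an export. The function name, the dictionary format, and the case-insensitive name matching are choices made for this sketch, not part of SFMC.

```python
def find_schema_drift(expected, actual):
    """Compare expected columns with a data extension's current columns.

    Both arguments map column name -> data type. Names and types are
    compared case-insensitively (an assumption of this sketch).
    """
    expected_norm = {name.lower(): dtype.lower() for name, dtype in expected.items()}
    actual_norm = {name.lower(): dtype.lower() for name, dtype in actual.items()}

    shared = expected_norm.keys() & actual_norm.keys()
    return {
        # Columns the query or file expects that are gone (removed or renamed).
        "missing": sorted(expected_norm.keys() - actual_norm.keys()),
        # Columns that now exist but were not expected.
        "added": sorted(actual_norm.keys() - expected_norm.keys()),
        # Columns present in both whose data type differs.
        "retyped": sorted(n for n in shared if expected_norm[n] != actual_norm[n]),
    }


# Hypothetical example: EmailAddress was renamed and OrderTotal changed type.
expected = {"SubscriberKey": "Text", "EmailAddress": "EmailAddress", "OrderTotal": "Number"}
actual = {"SubscriberKey": "Text", "Email": "EmailAddress", "OrderTotal": "Decimal"}
print(find_schema_drift(expected, actual))
```

    A renamed column shows up as one entry under "missing" and one under "added", which is often the quickest hint that a rename, rather than a deletion, caused the failure.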

    2. Expired or Revoked API Credentials

    Automations that rely on external data sources, SFTP file imports, or API-triggered sends will fail if the underlying credentials have expired. This is especially common with installed packages whose OAuth tokens have a set lifespan, or when someone rotates SFTP passwords without updating the corresponding File Transfer activity.

    What to check: Navigate to Setup > Installed Packages and verify that your server-to-server integrations are still active. For SFTP-based imports, confirm the credentials in your File Transfer activity match the current SFTP account details under Administration > Data Management > File Locations.

    3. SQL Query Timeouts

    Salesforce Marketing Cloud enforces a 30-minute timeout on SQL query activities. If your data extensions have grown significantly or your query involves multiple complex joins without proper filtering, the query may simply run out of time. The automation will report an error, but the error message (“Query failed”) is often unhelpfully vague.

    What to check: Run your SQL query manually in Query Studio and observe the execution time. If it’s approaching the 30-minute mark, you’ll need to optimize: add WHERE clauses to limit row counts, break the query into smaller steps, or use indexed fields in your JOIN conditions.
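
    If you record how long each query takes, a small helper can flag the ones drifting toward the limit before they fail. The sketch below uses the 30-minute timeout mentioned above; the 80% warning threshold and the function name are arbitrary assumptions for illustration.

```python
from datetime import timedelta

QUERY_TIMEOUT = timedelta(minutes=30)  # the timeout described above


def timeout_risk(duration, warn_ratio=0.8):
    """Classify a query's run time relative to the timeout.

    warn_ratio is a hypothetical threshold: 0.8 flags anything that
    uses 80% or more of the allowed time.
    """
    if duration >= QUERY_TIMEOUT:
        return "timed_out"
    if duration >= QUERY_TIMEOUT * warn_ratio:
        return "at_risk"
    return "ok"


# Hypothetical durations observed when running queries manually.
observed = {
    "Daily_Segment_Refresh": timedelta(minutes=6),
    "Purchase_History_Join": timedelta(minutes=26),
}
for name, duration in observed.items():
    print(f"{name}: {timeout_risk(duration)}")
```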

    4. Send Classification or Delivery Profile Issues

    If your automation includes an email send activity, it can fail due to problems with the send classification, sender profile, or delivery profile. This often happens after org-wide changes โ€” for example, if a shared sender profile’s “From” address is modified or a CAN-SPAM compliance footer is removed from a delivery profile.

    What to check: Open the email send activity and verify each component: the send classification, sender profile, and delivery profile. Make sure the “From” email address is verified and that the physical mailing address in the delivery profile is populated.

    5. Business Unit Permission Conflicts

    In multi-business-unit SFMC environments, automations can fail when shared data extensions or shared content lose their sharing permissions. If an admin changes sharing rules at the enterprise level, an automation in a child business unit may suddenly lose access to a data extension it was reading from or writing to.

    What to check: Confirm that all data extensions referenced in your automation are still shared to the business unit where the automation runs. Check under Shared Items in the parent business unit’s data extension folder.

    6. Schedule Conflicts and Overlapping Runs

    SFMC does not allow an automation to start a new run while a previous run is still executing. If your automation takes longer than expected (due to growing data volumes) and the next scheduled run attempts to start, the new run will be skipped. Over time this can cascade into what appears to be a “stopped” automation even though its status still shows as Active.

    What to check: Review the automation’s run history in Automation Studio. Look for overlapping timestamps or “Skipped” entries. If runs are consistently taking longer than the interval between scheduled starts, you’ll need to either optimize the automation’s activities or increase the time between runs.
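
    To make the overlap check concrete, here is a minimal sketch that takes run start and end times and reports any run that was still executing when the next scheduled start came due. It assumes a fixed schedule interval and that you have copied the timestamps out of the run history yourself. The data format and function name are hypothetical.

```python
from datetime import datetime, timedelta


def find_overruns(runs, interval):
    """Return (start, end, overrun) for runs that ran past the next scheduled start.

    runs is a list of (start, end) datetimes; interval is the time between
    scheduled starts.
    """
    overruns = []
    for start, end in runs:
        next_start = start + interval
        if end > next_start:
            overruns.append((start, end, end - next_start))
    return overruns


# Hypothetical hourly automation whose second run took 75 minutes.
interval = timedelta(hours=1)
runs = [
    (datetime(2024, 1, 8, 1, 0), datetime(2024, 1, 8, 1, 40)),
    (datetime(2024, 1, 8, 2, 0), datetime(2024, 1, 8, 3, 15)),
]
for start, end, overrun in find_overruns(runs, interval):
    print(f"Run started {start:%H:%M} overran the next scheduled start by {overrun}")
```

    Any run reported here corresponds to a scheduled start that would have been skipped, so a growing list is a sign that the interval needs to widen or the activities need optimizing.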

    How to Diagnose the Problem Quickly

    When an automation fails, follow this triage checklist:

    • Check the Run History: In Automation Studio, click on the automation and review the Activity tab. The step that failed will be highlighted in red. Note the exact error message and timestamp.
    • Examine the Error Log: For SQL queries, the error message usually indicates what went wrong (invalid column name, timeout, etc.). For imports, look for file-not-found or schema mismatch errors.
    • Test Each Step Manually: Run the failed activity in isolation. If it’s a SQL query, execute it in Query Studio. If it’s a file import, manually check the SFTP location for the expected file.
    • Review Recent Changes: Ask your team: did anyone modify a data extension, update credentials, change sharing rules, or deploy new content in the last 24-48 hours? Automation failures almost always correlate with a recent change.
    • Check SFMC System Status: Occasionally, the problem is on Salesforce’s end. Check trust.salesforce.com for any ongoing incidents affecting Marketing Cloud.

    Preventing Automation Failures Before They Happen

    The best fix is the one you never need. Here’s how experienced SFMC administrators keep their automations running reliably:

    • Implement proactive monitoring. Don’t wait for a stakeholder to notice that an email didn’t send. Use a monitoring solution like Martech Monitoring to get real-time alerts when automations fail, skip, or run longer than expected. Catching failures within minutes, rather than hours or days, drastically reduces the impact on your campaigns.
    • Document your data extension dependencies. Maintain a simple map of which automations read from and write to which data extensions. When someone needs to change a schema, they can check the map first and update dependent queries before they break.
    • Set up error-handling automations. Create a “watchdog” automation that checks whether critical data extensions were updated within their expected timeframes. If a key DE hasn’t been refreshed by 8 AM, the watchdog can send an alert email to your ops team.
    • Rotate credentials on a schedule. Don’t wait for API keys or SFTP passwords to expire unexpectedly. Set calendar reminders to rotate them proactively and update all dependent automations at the same time.
    • Optimize SQL queries as data grows. Review your query execution times quarterly. What ran fine on 500,000 rows may time out on 5 million. Join and filter on indexed fields such as primary keys, tighten WHERE clauses, and consider breaking monolithic queries into staged steps.
    • Use naming conventions and folder structures. Clearly name your automations, queries, and data extensions so that anyone on the team can understand what depends on what. Sloppy naming leads to accidental modifications and broken dependencies.
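
    The “watchdog” idea from the list above comes down to a freshness check. The sketch below shows only that logic, assuming you can obtain a last-refreshed timestamp for each critical data extension by some means. How you fetch those timestamps, the names used, and the maximum-age approach (a simplification of “refreshed by 8 AM”) are all assumptions for illustration.

```python
from datetime import datetime, timedelta


def find_stale(last_refreshed, max_age, now):
    """Return the names of data extensions older than their allowed age.

    last_refreshed maps DE name -> datetime of the last refresh.
    max_age maps DE name -> timedelta it may go without a refresh.
    DEs without a max_age entry are not checked.
    """
    stale = []
    for name, refreshed_at in last_refreshed.items():
        limit = max_age.get(name)
        if limit is not None and now - refreshed_at > limit:
            stale.append(name)
    return sorted(stale)


# Hypothetical check run at 8 AM: one DE missed last night's refresh.
now = datetime(2024, 1, 8, 8, 0)
last_refreshed = {
    "Orders_Daily": datetime(2024, 1, 8, 2, 30),
    "Loyalty_Members": datetime(2024, 1, 6, 2, 30),
}
max_age = {
    "Orders_Daily": timedelta(hours=24),
    "Loyalty_Members": timedelta(hours=24),
}
print(find_stale(last_refreshed, max_age, now))
```

    Checking age rather than a fixed clock time means the same check works for hourly, nightly, and weekly refreshes by changing only the allowed age.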

    When to Escalate to Salesforce Support

    If you’ve exhausted the troubleshooting steps above and your automation is still failing with vague or inconsistent error messages, it may be time to open a case with Salesforce Support. Provide them with:

    • The automation name and MID (Member ID) of the business unit
    • The exact error message and timestamps from the run history
    • A description of what changed before the failure started
    • Confirmation that you’ve tested each step in isolation

    Salesforce Support can access server-side logs that aren’t visible in the Automation Studio UI, which can reveal underlying platform issues.

    Don’t Let Broken Automations Derail Your Campaigns

    SFMC automation failures are inevitable, but slow detection isn’t. The teams that recover fastest are the ones who know about failures before anyone else does. If you’re tired of discovering broken automations hours (or days) after the fact, try Martech Monitoring free and get instant alerts the moment something goes wrong in your Marketing Cloud account. Your future self and your campaign stakeholders will thank you.

