How Does CanIt Work?

From Roaring Penguin
Latest revision as of 16:02, 16 August 2017

Main article: Quick Start Guide

CanIt has a complex set of rules and settings that allow for very fine-tuned control over the processing of mail. This sub-article discusses the main concepts in how this is done.

Please Note

The options within this guide may or may not apply to you, depending on your version of CanIt and how it has been set up by the e-mail administrator. If you can't find something as described but think that you need it, consult your administrator; there may be a reason that you don't have it.

What is a Stream?

When mail arrives, CanIt looks up the recipient address to determine whether it exists and, if it does, who should be responsible for it. This process, called Streaming, usually results in each email address having a unique stream name unless your administrator has configured it otherwise. This name is essentially your "Account", except that in certain circumstances you might have access to more than one stream. The stream associated with any given email address contains the most important set of rules and settings for all mail sent to that address, and it houses any mail that is trapped for that address.

Streams also allow for situations where addresses and inboxes do not have a one-to-one relationship.

One Address to Multiple Recipients

A group address will generally be given its own stream. As a result, all recipients within that group will be notified of any trapped messages for that stream. Alternatively, administrators will often point mail for a group address to a specific user's stream so that this user is in charge of its trapped mail, or they will make the stream accessible only to themselves.

Multiple Addresses to One Recipient

Instead of one stream going to many people, multiple addresses can also be set up to use a single stream. This was just mentioned in the case of a group address going to an individual, whose stream then houses both their personal mail and the group's. It also allows a single set of rules and settings to process mail for multiple email addresses, and a single quarantine to hold trapped mail for all of them. This is helpful if you have multiple alias addresses, because you will not have separate collections of rules and trapped mail. This can be done in two ways:

  • All mail can be delivered to the original recipient address (See Preferences->My Addresses in the WebUI) while sharing the same stream.
  • Mail for secondary addresses can be rewritten to the primary address and will be delivered there instead (See Preferences->Aliases in the WebUI).

An administrator can also set this up on a domain-wide level so that, for example, user1@example.com will also receive mail addressed to user1@example.org using either of the methods above.
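
The two methods above can be illustrated with a minimal sketch. The table names, addresses and the `resolve` function are hypothetical and only model the behaviour described in this guide; they are not CanIt's actual data model or API.

```python
# Hypothetical lookup tables; how CanIt really stores these is not covered here.
ALIASES = {
    # secondary address -> primary address (mail is rewritten and delivered there)
    "jdoe@example.com": "janedoe@example.com",
}
STREAMS = {
    "janedoe@example.com": "janedoe",
    "janed@example.com": "janedoe",  # shares the stream, keeps its own delivery address
    "sales@example.com": "sales",    # group address with its own stream
}

def resolve(recipient):
    """Return (delivery_address, stream_name) for an incoming recipient."""
    address = recipient.lower()
    delivery = ALIASES.get(address, address)  # rewrite only if an alias exists
    stream = STREAMS.get(delivery)
    if stream is None:
        raise KeyError(f"unknown address: {recipient}")
    return delivery, stream
```

In this sketch, mail for jdoe@example.com is rewritten to the primary address, while mail for janed@example.com is delivered as addressed; both end up under the same stream and therefore the same rules and quarantine.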

Who Makes the Rules?

Stream Rules

The types of rules that you can create and modify depend on what your administrator decides to allow. Unless your administrator decides otherwise, your stream is the only place where you will have any control. This lets you make rules that affect the flow of your own mail without affecting any of your co-workers. These are also the highest priority rules, so what you say will be honoured. This is also why your administrator might limit which types of rules you can use.

Higher Laws

CanIt will also look higher up for extra rules that do not conflict with your own. These can be rules that apply to all users in your organization, all users within the entire CanIt implementation, or some subset in between. Regardless of the structure, the most specific rule in your line of inheritance will always be used.

See diagram: Rule Inheritance
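
As a rough illustration of "most specific rule wins", the sketch below walks a list of levels from your own stream up to the system-wide rules and returns the first value it finds. The level names, the setting names and the `effective_rule` function are invented for this example; the real hierarchy depends on how your administrator has structured CanIt.

```python
def effective_rule(setting, chain):
    """Return (level_name, value) from the most specific level defining `setting`.

    `chain` is ordered from most specific (your stream) to least specific.
    """
    for level_name, rules in chain:
        if setting in rules:
            return level_name, rules[setting]
    raise KeyError(setting)

# Hypothetical levels and settings, for illustration only.
chain = [
    ("your stream", {"hold_executables": False}),
    ("organization", {"hold_executables": True, "notify_daily": True}),
    ("system-wide", {"hold_executables": True, "notify_daily": False, "greylist": True}),
]
```

Here `effective_rule("hold_executables", chain)` is answered by your own stream, while a setting that you have not defined falls through to the first higher level that defines it.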

Scanning

At this point CanIt will have very specific rules, unique to your needs, that will be used to determine what happens to your mail. There are a lot of things for it to check, and some forms of scanning can be a lot of work for the servers running CanIt, so it tries to save itself as much trouble as possible by only scanning what it has to.

One way that obvious spam is weeded out is with greylisting. This is the process of temporarily sending a "busy" signal to senders the first time that they try to send a message to any given recipient. Spammers will rarely bother to try twice, so that mail will never be scanned. Properly configured machines, however, will dutifully retry moments later.
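
The idea behind greylisting can be sketched in a few lines. This is a simplified model, not CanIt's implementation: the `Greylist` class, the 60-second `min_delay` and the use of the sender/recipient pair as the key are all assumptions made for the example.

```python
class Greylist:
    """Temporarily refuse the first attempt from each sender/recipient pair."""

    def __init__(self, min_delay=60.0):
        self.min_delay = min_delay  # assumed wait (seconds) before a retry is accepted
        self.first_seen = {}        # (sender, recipient) -> time of first attempt

    def check(self, sender, recipient, now):
        key = (sender.lower(), recipient.lower())
        # Record the first attempt; later attempts reuse the stored time.
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return "accept"    # the sender retried, so the mail goes on to scanning
        return "tempfail"      # the "busy" signal: try again later
```

A sender that never retries is never accepted, which is why that mail costs the servers almost nothing.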

If the sender is known to retry, there are still other exceptions that will cause mail to skip all other checks. The first is a virus scan: a message found to contain a virus is automatically discarded regardless of any other settings. Similarly, if there is a Block rule for the sender or domain, the message will be rejected without any other scanning being done. On the other hand, an Always-Allow rule tells CanIt to let the message through without doing any other scanning. This happens after the virus scan, so viruses sent from Allowed senders will still be blocked. The Allow rules will also be ignored in the event of an SPF fail, which indicates that the sender is spoofing a domain that it is not allowed to use. The Block/Allow rules work based on specificity, as described above, where the most specific rule always wins (i.e. a Sender rule trumps a Domain rule, and a rule set in your stream trumps one that you have inherited).

Absolute Block and Always-Allow rules are powerful tools that often only your administrator will have access to. If your administrator does allow you to use these tools, be aware of the following:

  • Spammers rarely use the same address twice, so a Block rule is not generally very effective. These rules are mostly effective for blocking junk-mail such as e-retailer newsletters that always come from the same address, but which have a tedious unsubscribe process.
  • Spammers will occasionally spoof the address of trusted senders, or will propagate mail from an infected machine that may belong to a trusted sender, which can lead to obvious spam making it through due to an Allow rule.
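
The order of the exceptions described in this section can be summarized in a short sketch. The function and argument names are hypothetical, and the example assumes that a message whose Allow rule is ignored because of an SPF fail simply continues on to normal scanning.

```python
def absolute_checks(has_virus, spf_fail, sender_rule=None, domain_rule=None):
    """Return 'discard', 'reject', 'deliver' or 'scan' for one message.

    `sender_rule` and `domain_rule` are 'block', 'allow' or None.
    """
    if has_virus:
        return "discard"  # the virus scan comes first and overrides Allow rules
    # A Sender rule is more specific than a Domain rule, so it wins if present.
    rule = sender_rule if sender_rule is not None else domain_rule
    if rule == "block":
        return "reject"
    if rule == "allow" and not spf_fail:
        return "deliver"  # skips all further scanning
    return "scan"         # no usable exception: go on to the scored tests
```

For example, a sender-level Allow rule overrides a domain-level Block rule, but it does not save a message that carries a virus or fails SPF.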

If none of the exceptions above apply, the message will be tested against all other rules in the system and will be given a score that indicates how confident CanIt is that the message is spam. Some of these tests include:

  • looking for suspicious information in the headers and metadata.
  • performing content analysis on the headers, body and attachments in the message.
  • searching reputation lists for known-bad senders, links and IP addresses.
  • filtering by attachment file types (including those in archives).
  • searching for auto-download functions which pull down malware after attachments such as Microsoft Office documents and PDFs are opened.
  • performing authenticity checks (SPF, DKIM and DMARC), reverse DNS lookups, and other common legitimacy checks.
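
One simple way to picture the score is as a running total of points awarded by the tests that a message fails. The test names and point values below are invented for illustration, and treating the score as a plain sum is an assumption; the actual tests and their weights are determined by CanIt and your administrator.

```python
def spam_score(hits):
    """Add up the points from every test that the message triggered."""
    return sum(points for _test_name, points in hits)

# Hypothetical test results for a single message.
hits = [
    ("suspicious_keywords", 2.0),
    ("forged_sender", 1.0),
    ("bad_reputation_ip", 3.0),
]
```

With these made-up values, `spam_score(hits)` comes to 6.0 points.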

Where Does it All Go?

After all the scanning is done, one of 3 things will have happened to the mail:

  • It failed an absolute rule and was rejected (or discarded).
  • It passed an absolute rule, skipped all other checks and was delivered.
  • It was given a score indicating how spammy CanIt thinks the message looks.

In the third case, CanIt then uses 3 thresholds to determine where the message should go based on that score.

Your Inbox

If the score is lower than the quarantine threshold (S-300), the message is passed on to your inbox. We recommend setting this threshold to 5 points and highly discourage anyone from adjusting it by more than about 0.2 points at a time. Even raising it by this much will often result in a significant amount of spam getting through, since a lot of mail will only fail one test and several tests allot exactly 5 points.

Your Pending Messages

If the score is higher than the quarantine threshold, it will be compared to the reject threshold (S-100). If it is lower than this threshold, the message will be held in the pending quarantine. It is these messages that you will generally be concerned about as a CanIt end-user. You may receive regular reports alerting you to these trapped messages or may have access to them in the WebUI. It is also possible that you may never see these messages and that your administrator monitors this list for you.

Automatic Rejection

If it is higher than the reject threshold it is then compared to the discard threshold (S-200). If it is lower than this threshold then it will be put into the Spam quarantine (Quarantine->Spam in the WebUI). You will generally not notice these messages unless you have access to the WebUI and you specifically go looking for them. They will not show up in notifications but are still recoverable if necessary.

Automatic Discard

If the score is higher than the discard threshold, the message will be tossed out and will not be recoverable. The only thing that remains of it is a log entry so that an administrator can find out what happened to it if necessary.
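
The four outcomes above can be sketched as a chain of comparisons. Only the quarantine value of 5 points comes from the recommendation in this guide; the `reject` and `discard` defaults are placeholder numbers chosen for the example, and treating a score exactly equal to a threshold as having reached it is also an assumption.

```python
def route(score, quarantine=5.0, reject=20.0, discard=100.0):
    """Decide where a scored message ends up.

    quarantine, reject and discard stand in for the S-300, S-100 and S-200
    thresholds; the reject and discard defaults here are illustrative only.
    """
    if score < quarantine:
        return "inbox"
    if score < reject:
        return "pending"          # held for you (or your administrator) to review
    if score < discard:
        return "spam quarantine"  # not in notifications, but still recoverable
    return "discard"              # only a log entry remains
```

With these placeholder thresholds, a message scoring 6 points would be held as pending, while one scoring 4.9 points would go straight to the inbox.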

Continue to: CanIt User Essentials