A two-pronged iOS release cycle

One noticeable aspect of this year’s WWDC keynote was the lack of any new features focused on the iPad. Federico Viticci has written about this on Mac Stories, where he said:

I wouldn’t be surprised to see Apple move from a monolithic iOS release cycle to two major iOS releases in the span of six months – one focused on foundational changes, interface refinements, performance, and iPhone; the other primarily aimed at iPad users in the Spring.

I think this is a very plausible scenario, and between iOS 9.3 and WWDC, it seems like it might be coming true. Why? Education.

Apple doesn’t release breakdowns, but a big chunk of iPad sales seems to come from the education market. Education runs to a fixed schedule: the academic year starts in the autumn, continues over winter and spring, with a long break in the summer. A lot of work happens in the summer break, including writing lesson plans for the coming year and deploying new tech.

The traditional iOS release cycle – preview the next big release at WWDC, release in the autumn – isn’t great for education. By the time the release ships, the school year is already underway. That can make it difficult for schools to adopt new features, often forcing them to wait for the next academic year.

If you look at the features introduced in iOS 9.3 – things like Shared iPad, Apple School Manager, or Managed Apple ID – these aren’t things that can be rolled out mid-year. They’re changes at the deployment stage. Once students have the devices, it’s too late. Even smaller things, like changes to iTunes U, can’t be used immediately, because they weren’t available when lesson plans were made over the summer. (And almost no teachers are going to run developer previews.)

This means there’s less urgency to get education-specific iPad features into the autumn release, because it’s often a year before they can be deployed. In a lot of cases, deferring that work for a later release (like with iOS 9.3) doesn’t make much of a difference for schools. And if you do that, it’s not a stretch to defer the rest of the iPad-specific work, and bundle it all into one big release that focuses on the iPad. Still an annual cycle, but six months offset from WWDC.

Moving to this cycle would have other benefits. Splitting the releases gives Apple more flexibility: they can spread the engineering work across the whole year, rather than focusing on one massive release for WWDC. It’s easier to slip an incomplete feature if the next major release is six months away, not twelve. And it’s a big PR item for the spring, a time that’s usually quiet for official Apple announcements.

I don’t know if this is Apple’s strategy; I’m totally guessing. But it seems plausible, and I’ll be interested to see if it pans out into 2017.


A subscription for my toothbrush 

Last weekend, I had my annual dental check-up. Thankfully, everything was fine, and my teeth are in (reasonably) good health.

While I was there, I got the usual reminder to change my toothbrush regularly. You’re supposed to replace your toothbrush every three months or so: it keeps the bristles fresh and the cleaning effective. If left to my own devices, I would probably forget to do this.

To help me fix this, I buy my toothbrushes through Amazon’s “Subscribe & Save” program. I have a subscription to my toothbrushes: I tell Amazon what I want to buy, and how often I want it, then they remember when to send me a package. So every six months, I get a new pack of brushes.

Amazon always email you a few days before they send the next delivery, so there’s a chance to cancel it if it’s no longer relevant. It doesn’t come completely out of the blue. And there’s a place for managing your subscriptions on your account page.

I’m sure I could remember to buy toothbrushes myself if I put my mind to it, but it’s quite nice that Amazon will do it for me. It’s just one less thing for me to think about.


Reading web pages on my Kindle

Of everything I’ve tried, my favourite device for reading is still my e-ink Kindle. Long reading sessions are much more comfortable on a display that isn’t backlit.

It’s easy to get ebooks on my Kindle – I can buy them straight from Amazon. But what about content I read on the web? If I’ve got a long article on my Mac that I’d like to read on my Kindle instead, how do I push it from one to the other?

There’s a Send to Kindle Mac app available, but that can only send documents on disk. I tried it a few times – save web pages to a PDF or HTML file, then send them to my Kindle through the app – but it was awkward, and the quality of the finished files wasn’t always great. A lot of web pages have complex layouts, which didn’t look good on the Kindle screen.

But I knew the folks at Instapaper had recently opened an API allowing you to use the same parser as they use in Instapaper itself. You give it a web page, and it passes back a nice, cleaned-up version that only includes the article text. Perhaps I could do something useful with that?

I decided to write a Python script that would let me send articles to a Kindle – from any device.

Continue reading →


Introduction to property-based testing 

On Tuesday night, I gave a talk about testing techniques at the Cambridge Python User Group (CamPUG for short). I spoke primarily about property-based testing and the Python library Hypothesis, but I also included an introduction to stateful testing, and to fuzz testing with american fuzzy lop (afl).

I was expecting traditional property-based testing to be the main attraction, with the stateful and fuzz testing as nice extras. In fact, I think interest was divided pretty evenly between the three topics.
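
If you haven’t seen property-based testing before, the idea is that instead of hand-picking example inputs, you describe how to generate them, and state a property that should hold for all of them. Here’s a small illustration using Hypothesis – my own toy example, not one from the talk:

from hypothesis import given, strategies as st

def run_length_encode(s):
    """Collapse runs of repeated characters into [char, count] pairs."""
    out = []
    for char in s:
        if out and out[-1][0] == char:
            out[-1][1] += 1
        else:
            out.append([char, 1])
    return out

def run_length_decode(encoded):
    return ''.join(char * count for char, count in encoded)

@given(st.text())
def test_decode_inverts_encode(s):
    # The property: decoding an encoded string always gives back the
    # original, whatever string Hypothesis generates.
    assert run_length_decode(run_length_encode(s)) == s

Run that under pytest and Hypothesis throws hundreds of generated strings at it, shrinking any failing input down to a minimal counterexample.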

I’ve posted the slides and my rough notes. Thanks to everybody who came on Tuesday – I had a really good evening.


Finding 404s and broken pages in my Apache logs

Sometime earlier this year, I broke the Piwik server-side analytics that I’d been using to count hits to the site. It sat this way for about two months before anybody noticed, which I took as a sign that I didn’t actually need them. I look at them for vanity, nothing more.

Since then, I’ve been using Python to parse my Apache logs, an idea borrowed from Dr. Drang. All I want is a rough view count, and if I work on the raw logs, then I can filter out a lot of noise from things like bots and referrer spam. High-level tools like Piwik and Google Analytics make it much harder to prune your results.

My Apache logs include a list of all the 404 errors: any time that somebody (or something) has found a missing page. This is useful information, because it tells me if I’ve broken something (not unlikely, see above). Although I try to have a helpful 404 page, that’s no substitute for fixing broken pages. So I wrote a script that looks for 404 errors in my Apache logs, and prints the most commonly hit pages – then I can decide whether to fix or ignore them.

The full script is on GitHub, along with some instructions. Below I’ll walk through the part that actually does the hard work.

 1  page_tally = collections.Counter()
 2
 3  for line in sys.stdin:
 4
 5      # Any line that isn't a 404 request is uninteresting.
 6      if '404' not in line:
 7          continue
 8
 9      # Parse the line, and check it really is a 404 request; otherwise,
10      # discard it.  Then get the page the user was trying to reach.
11      hit = PATTERN.match(line).groupdict()
12      if hit['status'] != '404':
13          continue
14      page = hit['request'].split()[1]
15
16      # If it's a 404 that I know I'm not going to fix, discard it.
17      if page in WONTFIX_404S:
18          continue
19
20      # If I fixed the page after this 404 came in, I'm not interested
21      # in hearing about it again.
22      if page in FIXED_404S:
23          time, _ = hit["time"].split()
24          date = datetime.strptime(time, "%d/%b/%Y:%H:%M:%S").date()
25          if date <= FIXED_404S[page]:
26              continue
27
28          # But I definitely want to know about links I thought I'd
29          # fixed but which are still broken.
30          print('!! ' + page)
31          print(line)
32          print('')
33
34      # This is a 404 request that we're interested in; go ahead and
35      # add it to the counter.
36      page_tally[page] += 1
37
38  for page, count in page_tally.most_common(25):
39      print('%5d\t%s' % (count, page))

I’m passing the Apache log in on stdin, and looping over the lines. Each line corresponds to a single hit.

On lines 6–7, I’m throwing away all the lines that don’t contain the string “404”. This might let through a few lines that aren’t 404 results – I’m not too fussed. This is just a cheap heuristic to avoid (relatively) slow parsing of lots of lines that I don’t care about.

On lines 11–14, I actually parse the line. My PATTERN regex for parsing the Apache log format comes from Dr. Drang’s post. Now I can properly filter for 404 results only, and discard the rest. The request parameter is usually something like GET /about/ HTTP/1.1 – a method, a page and an HTTP version. I only care about the page, so I throw away the rest.
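
PATTERN itself isn’t defined in the snippet above – it comes from Dr. Drang’s post – but a regex for the Apache combined log format looks roughly like this. This is an illustrative sketch (not the exact pattern I use), showing the named groups the script relies on:

import re

# Rough shape of a combined-log-format pattern.  The script only needs
# the 'time', 'request' and 'status' groups.
PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)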

Like any public-facing computer, my server is crawled by bots looking for unpatched versions of WordPress and PHP. They’re looking for login pages where they can brute force credentials or exploit known vulnerabilities. I don’t have PHP or WordPress installed, so they show up as 404 errors in my logs.

Once I’m happy that I’m not vulnerable to whatever they’re trying to exploit, I add those pages to WONTFIX_404S. On lines 17–18, I ignore any errors from those pages.

The point of writing this script is to find, and fix, broken pages. Once I’ve fixed the page, the hits are still in the historical logs, but they’re less interesting. I’d like to know if the page is still broken in future, but I already know that it was broken in the past.

When I fix a page, I add it to FIXED_404S, a dictionary in which the keys are pages, and the values are the date on which I think I fixed it. On lines 22–32, I throw away any broken pages that I’ve acknowledged and fixed, if they came in before the fix. But then I highlight anything that’s still broken, because it means my fix didn’t work.
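
Neither WONTFIX_404S nor FIXED_404S appears in the snippet above – they’re defined in the full script – but their shape is simple. Something like this, with illustrative entries rather than my real lists:

from datetime import date

# Broken URLs I've decided not to fix (mostly bot probes).
WONTFIX_404S = {
    '/wp-login.php',
    '/xmlrpc.php',
}

# Broken URLs I've already fixed, mapped to the date of the fix.
# A 404 for one of these pages *after* that date means the fix
# didn't work.
FIXED_404S = {
    '/atom.xml': date(2016, 5, 1),
    '/favicon.ico': date(2016, 5, 1),
}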

Any hit that hasn’t been skipped by now is “interesting”. It’s a 404’d page that I don’t want to ignore, and that I haven’t fixed in the past. I add 1 to the tally of broken pages, and carry on.

I’ve been using the Counter class from the Python standard library to store my tally. I could use a regular dictionary, but Counter helps clean up a little boilerplate. In particular, I don’t have to initialise a new key in the tally – it starts at a default of 0 – and at the end of the script, I can use the most_common() method to see the 404’d pages that are hit most often. That helps me prioritise what pages I want to fix.
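
As a quick illustration of what Counter buys you over a plain dict (a toy example, not from the script):

from collections import Counter

tally = Counter()

# No need to initialise keys first -- a missing key counts as 0.
tally['/missing-page/'] += 1
tally['/missing-page/'] += 1
tally['/other-page/'] += 1

print(tally.most_common(1))
# [('/missing-page/', 2)]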

Here’s a snippet from the output when I first ran the script:

23656   /atom.xml
14161   /robots.txt
 3199   /favicon.ico
 3075   /apple-touch-icon.png
  412   /wp-login.php
  401   /blog/2013/03/pinboard-backups/

Most of the actually broken or missing pages were easy to fix. In ten minutes, I fixed ~90% of the 404 problems that had occurred since I turned on Apache last August.

I don’t know how often I’ll actually run this script. I’ve fixed the most common errors; it’ll be a while before I have enough logs to make it worth doing another round of fixes. But it’s useful to have in my back pocket for a rainy day.


A Python smtplib wrapper for FastMail

Sometimes I want to send email from a Python script on my Mac. Up to now, my approach has been to shell out to osascript, and use AppleScript to invoke Mail.app to compose and send the message. This is sub-optimal on several levels:

  • It relies on Mail.app having up-to-date email config;
  • The compose window of Mail.app briefly pops into view, stealing focus from my main task;
  • Having a Python script shell out to run AppleScript is an ugly hack.

Plus it was a bit buggy and unreliable. Not a great solution.

My needs are fairly basic: I just want to be able to send a message from my email address, with a bit of body text and a subject, and optionally an attachment or two. And I’m only sending messages from one email provider, FastMail.

Since the Python standard library includes smtplib, I decided to give that a try.

After a bit of mucking around, I came up with this wrapper:

 1  from email import encoders
 2  from email.mime.base import MIMEBase
 3  from email.mime.multipart import MIMEMultipart
 4  from email.mime.text import MIMEText
 5  import smtplib
 6
 7  class FastMailSMTP(smtplib.SMTP_SSL):
 8      """A wrapper for handling SMTP connections to FastMail."""
 9
10      def __init__(self, username, password):
11          super().__init__('mail.messagingengine.com', port=465)
12          self.login(username, password)
13
14      def send_message(self, *,
15                       from_addr,
16                       to_addrs,
17                       msg,
18                       subject,
19                       attachments=None):
20          msg_root = MIMEMultipart()
21          msg_root['Subject'] = subject
22          msg_root['From'] = from_addr
23          msg_root['To'] = ', '.join(to_addrs)
24
25          msg_alternative = MIMEMultipart('alternative')
26          msg_root.attach(msg_alternative)
27          msg_alternative.attach(MIMEText(msg))
28
29          if attachments:
30              for attachment in attachments:
31                  prt = MIMEBase('application', "octet-stream")
32                  prt.set_payload(open(attachment, "rb").read())
33                  encoders.encode_base64(prt)
34                  prt.add_header(
35                      'Content-Disposition', 'attachment; filename="%s"'
36                      % attachment.replace('"', ''))
37                  msg_root.attach(prt)
38
39          self.sendmail(from_addr, to_addrs, msg_root.as_string())

Lines 7–12 create a subclass of smtplib.SMTP_SSL, and use the supplied credentials to log into FastMail. Annoyingly, this subclassing is broken on Python 2, because SMTP_SSL is an old-style class, and so super() doesn’t work. I only use Python 3 these days, so that’s okay for me, but you’ll need to change that if you want a backport.

For getting my username/password into the script, I use the keyring module. It gets them from the system keychain, which feels pretty secure. My email credentials are important – I don’t just want to store them in an environment variable or a hard-coded string.
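
Fetching the credentials with keyring is a one-liner. Something like this, where the service name and address are placeholders:

import keyring

# Read the password stored in the system keychain under a "fastmail"
# service entry.  Returns None if there's no matching item.
password = keyring.get_password('fastmail', 'hello@example.org')

with FastMailSMTP('hello@example.org', password) as server:
    ...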

Lines 14–19 define a convenience wrapper for sending a message. The * in the arguments list denotes the end of positional arguments – all the remaining arguments have to be passed as keyword arguments. This is a new feature in Python 3, and I really like it, especially for functions with lots of arguments. It helps enforce clarity in the calling code.

In lines 20–23, I’m setting up a MIME message with my email headers. I deliberately use a multi-part MIME message so that I can add attachments later, if I want.

Then I add the body text. With MIME, you can send multiple versions of the body: a plain text and an HTML version, and the recipient’s client can choose which to display. In practice, I almost always use plaintext email, so that’s all I’ve implemented. If you want HTML, see Stack Overflow.
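
For reference, adding an HTML version would only take one extra line inside send_message. Here’s a sketch of the standard email.mime approach, where html_body stands in for a hypothetical extra argument:

# Parts of a multipart/alternative message are listed in increasing
# order of preference, so clients that understand HTML will show the
# HTML version, and plain-text clients fall back to the first part.
msg_alternative.attach(MIMEText(msg, 'plain'))
msg_alternative.attach(MIMEText(html_body, 'html'))  # html_body: hypothetical argument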

Then lines 29–37 add the attachments – if there are any. Note that I use None as the default value for the attachments argument, not an empty list – this is to avoid any gotchas around mutable default arguments.
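
If that gotcha is unfamiliar, it’s easy to demonstrate:

def append_to(item, target=[]):      # the default list is created once, at definition time...
    target.append(item)
    return target

print(append_to(1))   # [1]
print(append_to(2))   # [1, 2] -- ...and shared between every call

def append_to_safe(item, target=None):
    if target is None:
        target = []                  # a fresh list on every call
    target.append(item)
    return target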

Finally, on line 39, I call the sendmail method from the SMTP class, which actually dispatches the message into the aether.

The nice thing about subclassing the standard SMTP class is that I can use my wrapper class as a drop-in replacement. Like so:

with FastMailSMTP(user, pw) as server:
    server.send_message(from_addr='hello@example.org',
                        to_addrs=['jane@doe.net', 'john@smith.org'],
                        msg='Hello world from Python!',
                        subject='Sent from smtplib',
                        attachments=['myfile.txt'])

I think this is a cleaner interface to email. Mucking about with MIME messages and SMTP is a necessary evil, but I don’t always care about those details. If I’m writing a script where email support is an orthogonal feature, it’s nice to have them abstracted away.


Safely deleting a file called ‘-rf *’

Odd thing that happened at work today: we accidentally created a file called -rf * on one of our dev boxes. Linux allows almost any character in a filename, with the exception of the null byte and the slash, which lets you create unhelpfully-named files like this. (Although you can’t create -rf /, because of the slash.)

You have to be a bit careful deleting a file like that, because running rm -rf * usually deletes everything in the current directory. Quoting it – rm "-rf *", perhaps – isn’t enough on its own: rm still sees the leading dash and parses it as flags. There are incantations that work, like rm -- "-rf *" or rm ./"-rf *", but you have to get them exactly right.

On systems with a GUI, you can just use the graphical file manager, which doesn’t care about shell flags. But most of our boxes are accessed via SSH, and only present a command line.

Another possible fix is to rename the file to something which doesn’t look like a shell command, and delete the renamed file. But mv "-rf *" deleteme has the same leading-dash problem as before.

In the end, I went with an inelegant but practical solution:

$ python -c 'import shutil; shutil.move("-rf *", "deleteme")'

Python doesn’t know anything about these Unix flags, so it renames the file without complaint.

I feel like I should probably know how to quote the filename correctly to delete this without going to Python, but sometimes safety and pragmatism trump elegance. This works, and it got us out of a mildly tricky spot.

Hopefully this is not something many people will need to fix, but now that it’s all sorted, I can’t help finding the whole thing mildly amusing.


“The document could not be saved”

I try not to make unreasonable complaints about the quality of software. I write software for a living, and writing bug-free software is hard. A trawl through the code I’ve written would reveal many embarrassing or annoying bugs. People in glass houses, etc.

But I do have some basic expectations of the software I use. For example, I expect that my text editor should be able to open, edit, and save files.

A screenshot of a TextMate window showing a dialog: “The document ‘hello.txt’ could not be saved. Please check Console output for reason.”

Two days ago, TextMate on my laptop decided it didn’t fancy saving files any more. Actually, it decided that ⌘+S should do nothing, and that clicking “Save” in the File menu would throw up the error dialog above. A trawl of Google suggests that I’m the only person who’s ever hit this error. If so, here’s a quick write-up of what I tried, for the next person who runs into it.

Continue reading →


Hiding the YouTube search bar

This morning, I got an email from Sam, asking if I had a way to cover up the persistent YouTube search bar.

Three years ago, I wrote a bookmarklet for cleaning up the worst of the Google Maps interface, and we can adapt this to clean up YouTube as well. Unlike that post, this is one I’m likely to use myself. (Writing the Maps bookmarklet was a fun exercise in JavaScript, but I almost always use Google Maps on my phone, so I was never as annoyed by the clutter on the desktop version.)

If we do “Inspect Element” on a YouTube page, we can find the element that contains this search box: <div id="yt-masthead-container">. So we want to toggle the visibility of this element. Since it’s only one item, we can write a much smaller JavaScript snippet for toggling the visibility:

var search_bar = document.getElementById("yt-masthead-container");

// Check if it's already hidden
var hidden = (window.getComputedStyle(search_bar)).getPropertyValue("display");

// Set the visibility based on the opposite of the current state
void(search_bar.style.display = (hidden == "none" ? "" : "none"));

To use this code, drag this link to your bookmarks bar:

Toggle the YouTube search bar

Simply click it once to make the bar disappear, and click it again to bring it all back.

Something that wasn’t in my original Google Maps bookmarklet is that void() call. It turns out that if a bookmarklet returns a value, it’s supposed to replace the current page with that value. Which strikes me as bizarre, but that’s what Chrome does, so it broke the page. (Safari doesn’t – not sure if that’s a bug or a feature.) The void function prevents that from happening.

This isn’t perfect – content below the bar doesn’t reflow to take up the available space – but the bar no longer hangs over content as you scroll. I think I’ll find this useful when I’m pressed for space on small screens. It’s a bit more screen real-estate I can reclaim. Thanks for the idea, Sam!


Treat regular expressions as code, not magic

Regular expressions (or regexes) have a reputation for being unreadable. They provide a very powerful way to manipulate text, in a very compact syntax, but it can be tricky to work out what they’re doing. If you don’t write them carefully, you can end up with an unmaintainable monstrosity.

Some regexes are just pathological¹, but the vast majority are more tractable. What matters is how they’re written. It’s not difficult to write regexes that are easy to read – and that makes them easy to edit, maintain, and test. This post has a few of my tips for making regexes that are more readable.

Here’s a non-trivial regex that we’d like to read:

MYSTERY = r'^v?([0-9]+)(\.([0-9]+)(\.([0-9]+[a-z]))?)?$'

What’s it trying to parse? Let’s break it down.

Tip 1: Split your regex over multiple lines

A common code smell is “clever” one-liners. Lots of things happen on a single line, which makes it easy to get confused and make mistakes. Since disk space is rarely at a premium (at least, not any more), it’s better to break these up across multiple lines, into simpler, more understandable statements.

Regexes are an extreme version of clever one-liners. Splitting a regex over multiple lines can highlight the natural groups, and make it easier to parse. Here’s what our regex looks like, with some newlines and indentation:

MYSTERY = (
    r'^v?'
    r'([0-9]+)'
    r'('
        r'\.([0-9]+)'
        r'('
            r'\.([0-9]+[a-z])'
        r')?'
    r')?$'
)

This is the same string, but broken into small fragments. Each fragment is much simpler than the whole, and you can start to understand what the regex is doing by analysing each fragment individually. And just as whitespace and indentation are helpful in non-regex code, here they help to convey the structure – different groups are indented to different levels.

So now we have some idea of what this regex is matching. But what was it trying to match?

Tip 2: Comment your regexes

Comments are really important for the readability of code. Good comments should explain why the code was written this way – what problem was it trying to solve?

This is helpful for many reasons. It helps us understand what the code is doing, why it might make some non-obvious choices, and helps to spot bugs. If we know what the code was supposed to do, and it does something different, we know there’s a problem. We can’t do that with uncommented code.

Regexes are a form of code, and should be commented as such. I like to have a top-level comment that explains the overall purpose of the regex, as well as individual comments for its broken-down parts. Here’s what I’d write for our example:

# Regex for matching version strings of the form vXX.YY.ZZa, where
# everything except the major version XX is optional, and the final
# letter can be any character a-z.
#
# Examples: 1, v1.0, v1.0.2, v2.0.3a, 4.0.6b
VERSION_REGEX = (
    r'^v?'                          # optional leading v
    r'([0-9]+)'                     # major version number
    r'('
        r'\.([0-9]+)'               # minor version number
        r'('
            r'\.([0-9]+[a-z]?)'     # micro version number, plus
                                    # optional build character
        r')?'
    r')?$'
)

As I was writing these comments, I actually spotted a mistake in my original regex – I’d forgotten the ? for the optional final character.

With these comments, it’s easy to see exactly what the regex is doing. We can see what it’s trying to match, and jump to the part of the regex which matches a particular component. This makes it easier to do small tweaks, because you can go straight to the fragment which controls the existing behaviour.

So now we can read the regex. How do we get information out of it?

Tip 3: Use non-capturing groups

The parentheses throughout my regex are groups. These are useful for organising and parsing information from a matching string. In this example:

  • The groups for minor and micro version numbers are followed by a ? – the dot and the associated number are both optional. Putting them both in a group, and making them optional together, means that v2 is a valid match, but v2. isn’t.

  • There’s a group for each component of the version string, so I can get them out later. For example, given v2.0.3b, it can tell us that the major version is 2, the minor version is 0, and the micro version is 3b.

In Python, we can look up the value of these groups with the .groups() method, like so:

>>> import re
>>> m = re.match(VERSION_REGEX, "v2.0.3b")
>>> m.groups()
('2', '.0.3b', '0', '.3b', '3b')

Hmm.

We can see the values we want, but there are a couple of extras. We could just code around them, but it would be better if the regex only captured interesting values.

If you start a group with (?:, it becomes a non-capturing group. We can still use it to organise the regex, but the value isn’t saved.

I’ve changed two groups to be non-capturing in our example:

# Regex for matching version strings of the form vXX.YY.ZZa, where
# everything except the major version XX is optional, and the final
# letter can be any character a-z.
#
# Examples: 1, v1.0, v1.0.2, v2.0.3a, 4.0.6b
NON_CAPTURING_VERSION_REGEX = (
    r'^v?'                          # optional leading v
    r'([0-9]+)'                     # major version number
    r'(?:'
        r'\.([0-9]+)'               # minor version number
        r'(?:'
            r'\.([0-9]+[a-z]?)'     # micro version number, plus
                                    # optional build character
        r')?'
    r')?$'
)

Now when we extract the group values, we’ll only get the components that we’re interested in:

>>> m = re.match(NON_CAPTURING_VERSION_REGEX, "v2.0.3b")
>>> m.groups()
('2', '0', '3b')
>>> m.group(2)
'0'

Now we’ve cut out the noise, and we can access the interesting values of the regex. Let’s go one step further.

Tip 4: Always use named capturing groups

What does m.group(2) mean? It’s not very obvious, unless I have the regex that m was matching against. When reading code, it can be difficult to know what the value of a capturing group means.

And suppose I later change the regex, and insert a new capturing group before the end. I now have to renumber anywhere I was getting groups with the old numbering scheme. That’s incredibly fragile.

There’s a reason we use text, not numbers, to name variables in our programs. If a variable has a descriptive name, the code is much easier to read, because we know what the variable “means”. And when we’re writing code, we’re much less likely to get variables confused.

The same logic should apply to regexes.

Many regex parsers now support named capturing groups. You can supply an alternative name for looking up the value of a group. In Python, the syntax is (?P<name>...) – it varies slightly from language to language.

If we add named groups to our expression:

# Regex for matching version strings of the form vXX.YY.ZZa, where
# everything except the major version XX is optional, and the final
# letter can be any character a-z.
#
# Examples: 1, v1.0, v1.0.2, v2.0.3a, 4.0.6b
NAMED_CAPTURING_VERSION_REGEX = (
    r'^v?'                                # optional leading v
    r'(?P<major>[0-9]+)'                  # major version number
    r'(?:'
        r'\.(?P<minor>[0-9]+)'            # minor version number
        r'(?:'
            r'\.(?P<micro>[0-9]+[a-z]?)'  # micro version number, plus
                                          # optional build character
        r')?'
    r')?$'
)

We can now look up the groups by name, or indeed access the entire collection with the groupdict() method:

>>> m = re.match(NAMED_CAPTURING_VERSION_REGEX, "v2.0.3b")
>>> m.groups()
('2', '0', '3b')
>>> m.group('minor')
'0'
>>> m.groupdict()
{'major': '2', 'micro': '3b', 'minor': '0'}

If I look up a group with m.group('minor'), it’s much clearer what it means. And if the underlying regex ever changes, the lookup is fine as-is. Named capturing groups make our code much more explicit and robust.

Conclusion

The tips I’ve suggested – significant whitespace, comments, using descriptive names – are useful, but they’re hardly revolutionary. These are all hallmarks of good code.

Regexes are often allowed to bypass the usual metrics of code quality. They sit as black boxes in the middle of a codebase, monolithic strings that look complicated and scary. If you treat regexes as code, rather than magic, you end up breaking them down, and making them more readable. The result is always an improvement.

Regexes don’t have to be scary. Just treat them as another piece of code.


  1. Validating email addresses is a problem that you probably shouldn’t try to solve with regexes. Usually you want to know that the user has access to the address, not just that it’s correctly formatted. To check that, you need to actually send them an email – which ensures it’s valid at the same time. ↩︎

