This series of posts will explore how we can use the HTML canvas element to build a simple browser-based drawing tool. In this final post we look at allowing users to add text annotations to the canvas.


Welcome to the fourth (and final) post in this series, the goal of which is to build a canvas-based drawing tool from scratch with zero dependencies. The tool should allow the user to upload an existing image and to embellish it with free-hand drawing, fixed shapes and text annotations before exporting the image again. This tool is being built for instructional purposes, but for convenience I have made it available here, if you would like to have a play around.

If you intend to follow the tutorial for yourself I would encourage you to grab the source code from the GitHub repo. The version of the code we build today will extend what was covered in the previous tutorials of this series: Part I, Part II and Part III. Take a look back over these earlier tutorials if you come across any concepts or constructs that look unfamiliar.

In this post we will look at implementing functionality which allows the user to add text captions to their canvas.

Let's recap what we were planning to implement and where we are, currently:

  1. Draw free hand lines on the canvas (Part I)
  2. Draw resizable rectangles to highlight a section of the image (Part II)
  3. Set a background image on the canvas that we can annotate (Part III)
  4. Add text captions

The source code for the version of the tool that we build in this tutorial can be found on this branch of the GitHub repo, and this video briefly demonstrates the functionality we hope to build here:

Page markup and initial setup

The markup from the previous tutorials will, once again, need to be extended to include a few more buttons to allow the user to add and edit text captions. As part of this work we have reorganised the page layout; however, we will focus only on the new elements relating to our text-input functionality. A relevant extract from index.html is shown here:

        <div id="tools" class="control_panel_section">
          <div class="control_option">
            <button class="btn tool-btn with-context-menu" id="draw_tool_btn" data-active="false" data-target="pencil-controls">✎</button>
            <button class="btn tool-btn" id="erase_tool_btn" data-active="false"><div class="erase_rect">▭</div></button>
            <button class="btn tool-btn with-context-menu" id="rect_tool_btn" data-active="false" data-target="shape-controls">⊞</button>
            <button class="btn tool-btn" id="selector_tool_btn" data-active="false"><div class="hand_pointer">☞</div></button>
            <button class="btn tool-btn with-context-menu" id="text_tool_btn" data-active="false" data-target="text-controls">T</button>
          </div>

          <div id="context-menu">
            <!-- Pencil controls -->
            <div id="pencil-controls" class="control_option" style="display: none">
              <!-- ... -->
            </div>

            <!-- Rectangle controls -->
            <div id="shape-controls" class="control_option" style="display: none">
              <!-- ... -->
            </div>

            <!-- Text controls -->
            <div id="text-controls" class="control_option" style="display: none">
                <select name="font_size" id="font_size">
                  <option value="10">XS</option>
                  <option value="12">S</option>
                  <option value="16" selected="">M</option>
                  <option value="22">L</option>
                  <option value="32">XL</option>
                </select>
                <select name="font_colour" id="font_colour">
                  <option value="black" selected="">Black</option>
                  <option value="white">White</option>
                  <option value="red">Red</option>
                  <option value="green">Green</option>
                  <option value="blue">Blue</option>
                </select>
                <select name="background_colour" id="background_colour">
                  <option value="white" selected="">White</option>
                  <option value="black">Black</option>
                  <option value="transparent">Transparent</option>
                </select>
            </div>
          </div>
        </div>

      <script src="js/page.js"></script>
      <script src="js/image.js"></script>
      <script src="js/eraser.js"></script>
      <script src="js/pencil.js"></script>
      <script src="js/rectangle.js"></script>
      <script src="js/shape.js"></script>
      <script src="js/text_box.js"></script>
      <script src="js/text.js"></script>

We have added a new tool button, text_tool_btn, which will activate the text-input tool. We have also added a few input elements within the text-controls context menu. These new input elements should allow the user to alter the font-size, font-colour and background-colour of the text caption.

Aside from the new HTML elements on the page, you can see that we are now loading two additional scripts, js/text_box.js and js/text.js. The script at js/text.js will declare a TEXT property on the global window object, and the top-level page initialization script (at js/page.js) will initialize this TEXT module along with the previously-implemented modules:

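  page.init = function(canvas_id){
    // Initialize canvas size from its computed CSS dimensions
    page.canvas = document.getElementById(canvas_id);
    page.canvas.width = window.getComputedStyle(page.canvas, null)
      .getPropertyValue("width")
      .replace(/px$/, '');
    page.canvas.height = window.getComputedStyle(page.canvas, null)
      .getPropertyValue("height")
      .replace(/px$/, '');
    page.ctx = this.canvas.getContext('2d');

    // Initialize whichever modules have been loaded. Only the TEXT call is
    // covered in this post; the other bodies are as in the earlier parts.
    if(typeof window.PENCIL !== "undefined"){ /* ... */ }
    if(typeof window.IMAGE !== "undefined"){ /* ... */ }
    if(typeof window.ERASER !== "undefined"){ /* ... */ }
    if(typeof window.SHAPE !== "undefined"){ /* ... */ }
    if(typeof window.TEXT !== "undefined"){
      TEXT.init(page.ctx);
    }
  };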

With this bit of bootstrapping carried out we can now take a look at what the TEXT module actually does when we initialize it.
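
To make the bootstrapping concrete, here is a minimal sketch of the same idea that runs outside the browser. It is not the tool's code: the `root` object stands in for `window`, and `DEMO_TEXT`, `DEMO_PENCIL`, `init_modules` and `fake_ctx` are hypothetical names invented for the illustration.

```javascript
// Sketch only: `root` stands in for `window`; DEMO_TEXT and fake_ctx are
// hypothetical names used purely for illustration.
const root = {};

// A module built with the revealing module pattern: private state lives in
// the closure, and only the returned object is visible to callers.
root.DEMO_TEXT = (function () {
  let ctx = null; // private to the module

  const init = function (context) {
    ctx = context;
  };
  const is_ready = function () {
    return ctx !== null;
  };

  return { init: init, is_ready: is_ready };
})();

// The page initializes only the modules that were actually loaded.
const init_modules = function (names, context) {
  const initialized = [];
  names.forEach(function (name) {
    if (typeof root[name] !== "undefined") {
      root[name].init(context);
      initialized.push(name);
    }
  });
  return initialized;
};

const fake_ctx = { canvas: {} };
// DEMO_PENCIL was never defined, so it is skipped without an error.
const loaded = init_modules(["DEMO_PENCIL", "DEMO_TEXT"], fake_ctx);
console.log(loaded);
```

The `typeof` guard is what lets a page drop a script tag without breaking the initialization of the remaining modules.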

Initializing text-input behaviours

As with our previous modules, the TEXT module is defined using the revealing module pattern. When we call the init method on this module we are doing the following:

  let ctx = null,
    canvas = null,
    p = null,
    all_texts = [],
    font_colour = "black",
    font_size = 16,
    background_colour = "white";


  text.init = function(context){
    ctx = context;
    canvas = context.canvas;

    // Initialize touch point state
    p = new Point({ x: 0, y: 0, canvas: canvas });

    // Bind the change handlers for font size, font colour and background colour
    init_text_style_handlers();

    document.getElementById("text_tool_btn").addEventListener("click", throttle(function(e){
      // Find the tool button that was clicked and flip its data-active flag
      const $target = e.target.closest(".tool-btn"),
        active = ($target.dataset.active === "true");
      $target.dataset.active = !active;
      toggle_text_handlers(!active);
      PAGE.toggle_context_menu($target, !active);
    }, 50));
  };

At the top level of the module you can see that we maintain the state of the text input through a number of variables. The font_size, font_colour and background_colour are all initialized with default values and we also maintain an all_texts array, which is initially empty. We will use this array to store a reference to each text caption we place on our canvas. Within the init method we initialize the reference to the canvas element and its associated rendering context (ctx), which is passed as an argument when the init method is invoked. Also within the init method we build a new Point object, which is used to convert the location of our user interactions to and from the canvas coordinates. As part of the initialization we also invoke init_text_style_handlers, which is simply responsible for binding to change events on the font-size, font-colour and background-colour select tags. For example:

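   const init_text_style_handlers = function(){
    const $font_size_select = document.getElementById("font_size"),
      $font_colour_select = document.getElementById("font_colour"),
      $background_colour_select = document.getElementById("background_colour");
    $font_size_select.addEventListener("change", function(e){
      font_size = parseInt(e.target.value, 10);
      // Redraw any active caption (the `active` flag is an assumed property name)
      all_texts.forEach(function(text){
        if(text.active){
          text.font_size = font_size;
          text.draw();
        }
      });
    });
    // ... equivalent handlers for $font_colour_select and $background_colour_select
   };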

In the code snippet above we limit consideration to handling font-size changes. The change event on the select element is used to set the corresponding module-level state (font_size in this case). We then loop over the all_texts array and redraw, with the new font-size, any text element that is flagged as currently active.
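
The update-then-redraw step can be sketched without a DOM. In the snippet below plain objects stand in for the text captions; the `active` flag, the `draw_count` counter, `make_caption` and `on_font_size_change` are assumptions made for the illustration rather than names taken from the tool.

```javascript
// Sketch only: plain objects stand in for TextBox instances, and the
// `active` flag and `draw` method are assumed names.
const make_caption = function (active) {
  return {
    active: active,
    font_size: 16,
    draw_count: 0,
    draw: function () {
      this.draw_count += 1;
      return this;
    }
  };
};

const all_texts = [make_caption(false), make_caption(true)];
let font_size = 16;

// What a "change" handler might do with the string value read from the <select>.
const on_font_size_change = function (raw_value) {
  font_size = parseInt(raw_value, 10); // option values arrive as strings
  all_texts.forEach(function (caption) {
    if (caption.active) {
      caption.font_size = font_size;
      caption.draw();
    }
  });
};

on_font_size_change("22");
```

Only the active caption picks up the new size; the inactive one keeps whatever it was drawn with, while the module-level default changes for captions created later.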

The final component in the TEXT.init function is setting up the click-handler on the text_tool_btn. Clicking this button will execute the following:

  1. Toggle the data-active attribute on the button, which tracks whether the tool is active/inactive
  2. Invoke the toggle_text_handlers function, which we will examine shortly
  3. Delegate to the PAGE module to show (or hide) the context menu associated with text input

Point 3 simply hides or reveals the context menu which includes our font-size, font-colour and background-colour inputs. More interesting is the setting up of the canvas event handlers, achieved by toggle_text_handlers:

  const toggle_text_handlers = function(on) {
    const method = on ? canvas.addEventListener : canvas.removeEventListener;
    method.call(canvas, 'mouseup', mouseup);
    method.call(canvas, 'touchend', touchend);
  };
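
The add-or-remove switch can be tried outside the browser. This is a sketch under assumptions: a bare `EventTarget` stands in for the canvas element, and `target`, `toggle_handlers` and `mouseup_count` are names made up for the example.

```javascript
// Sketch only: a bare EventTarget stands in for the canvas element.
const target = new EventTarget();
let mouseup_count = 0;
const mouseup = function () {
  mouseup_count += 1;
};

const toggle_handlers = function (on) {
  // Pick the method, then invoke it with the target as `this`.
  // Calling `method("mouseup", mouseup)` directly would lose `this`.
  const method = on ? target.addEventListener : target.removeEventListener;
  method.call(target, "mouseup", mouseup);
};

toggle_handlers(true);
target.dispatchEvent(new Event("mouseup")); // handled
toggle_handlers(false);
target.dispatchEvent(new Event("mouseup")); // ignored, the handler was removed
```

Removal only works because the same function reference is passed both times, which is why the handlers are declared once as named constants rather than inline.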

The toggle_text_handlers function is used either to add handlers to the canvas element or to remove them from it. Whether we are adding or removing is controlled by the boolean value of the on argument. The canvas element will listen to mouseup and touchend events, but they are basically doing the same thing, so let's focus on the handling of touchend events:

  const touchend = function(event){
    if (event.changedTouches.length == 1) {
      const touch = event.changedTouches[0];
      p.x = touch.pageX;
      p.y = touch.pageY;
      activate_or_create_text_box(p);
    }
  };

We first verify that we are dealing with a single touch point, then we capture the pageX and pageY coordinates of this touch event in our Point object (p), which we then pass to the activate_or_create_text_box function:

  const activate_or_create_text_box = function(point){
    for(let i=0, len=all_texts.length; i<len; i++){
      // If all_texts[i] coincides with the point, activate it and return early
    }
    all_texts.push(new TextBox({
      x: point.canvas_x,
      y: point.canvas_y,
      canvas: canvas,
      font_colour: font_colour,
      font_size: font_size,
      background_colour: background_colour
    }));
  };

This function will loop over our all_texts array to determine if the Point of interaction coincides with an existing text input. If it does, we activate that existing text input. Otherwise we create a new TextBox object and add that to our all_texts array. The TextBox class is a custom class which we use to represent these editable text captions. We will shortly examine how these TextBox objects work, but first we introduce a couple of utilities that will be needed.
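
The activate-or-create decision can be sketched with plain objects. This is a simplified illustration rather than the tool's implementation: `make_box`, `contains` and `activate` are assumed names, and x and y are treated as the top-left corner to keep the arithmetic easy to follow.

```javascript
// Sketch only: `contains` and `activate` are assumed names, and x/y are
// treated as the top-left corner to keep the arithmetic simple.
const make_box = function (x, y, width, height) {
  return {
    x: x, y: y, width: width, height: height, active: false,
    contains: function (px, py) {
      return px >= this.x && px <= this.x + this.width &&
             py >= this.y && py <= this.y + this.height;
    },
    activate: function () {
      this.active = true;
    }
  };
};

const boxes = [make_box(10, 10, 100, 30)];

const activate_or_create = function (px, py) {
  for (let i = 0, len = boxes.length; i < len; i++) {
    if (boxes[i].contains(px, py)) {
      boxes[i].activate();
      return boxes[i]; // reuse the existing caption
    }
  }
  const created = make_box(px, py, 0, 0);
  boxes.push(created); // nothing was hit, so start a new caption
  return created;
};

const hit = activate_or_create(50, 20);    // lands inside the first box
const miss = activate_or_create(300, 200); // lands on empty space
```

Returning early from the loop matters: without it a click on an existing caption would also create a second, overlapping one.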


With no libraries to lean on we will need to add a debounce method. The implementation looks like this:

window.debounce = (callback, wait) => {
  let timeoutId = null;
  return (...args) => {
    // Cancel any pending invocation before scheduling a new one
    window.clearTimeout(timeoutId);
    timeoutId = window.setTimeout(() => {
      callback.apply(null, args);
    }, wait);
  };
};

Wrapping the callback in debounce ensures that the callback is only invoked once a delay of wait milliseconds has elapsed without the wrapped function being called again. This behaviour is very useful when multiple events are fired in quick succession, but we only want to trigger the callback when the events stop. In our case, when the user is typing into the text input we don't really want to react to each input; rather, we want to react once the user has finished. Here we interpret a long pause as indicating that the user has finished typing.
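
A quick way to see this behaviour is to fire a few calls back to back and check what reaches the callback. The sketch below declares a local `debounce` so that it runs outside the browser; `on_input`, `calls` and the 20ms wait are illustrative choices, not values from the tool.

```javascript
// A local debounce, so that the sketch does not depend on `window`.
const debounce = (callback, wait) => {
  let timeoutId = null;
  return (...args) => {
    clearTimeout(timeoutId); // cancel the pending call, if any
    timeoutId = setTimeout(() => {
      callback.apply(null, args);
    }, wait);
  };
};

const calls = [];
const on_input = debounce((value) => calls.push(value), 20);

// Three rapid "keystrokes": only the last one should reach the callback.
on_input("h");
on_input("he");
on_input("hey");

// Nothing has fired yet, because the wait has not elapsed.
const calls_before_wait = calls.length;

// Once the wait has passed, a single call with the latest value is recorded.
setTimeout(() => console.log(calls), 60);
```

Each new call throws away the timer set by the previous one, so the callback only ever sees the arguments of the final call in a burst.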

Another utility which we will require is the CanvasScaler. As we saw previously, we have two sets of coordinates that we need to map between: the (x,y) viewport coordinates associated with our touch and mouse events, and the (canvas_x, canvas_y) coordinates that we use to animate on our canvas. The CanvasScaler class exposes methods which allow us to convert horizontal and vertical lengths in the viewport basis, over to the canvas basis, and vice-versa.

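class CanvasScaler {
  constructor(canvas) {
    this.canvas = canvas;
    this.css_width = window.getComputedStyle(canvas, null)
      .getPropertyValue("width")
      .replace(/px$/, '');
    this.css_height = window.getComputedStyle(canvas, null)
      .getPropertyValue("height")
      .replace(/px$/, '');
  }

  scale_to_canvas_horizontal(length) {
    return length*this.canvas.width/this.css_width;
  }

  scale_to_canvas_vertical(length) {
    return length*this.canvas.height/this.css_height;
  }

  scale_from_canvas_horizontal(length) {
    return length*this.css_width/this.canvas.width;
  }

  scale_from_canvas_vertical(length) {
    return length*this.css_height/this.canvas.height;
  }
}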
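
A worked example may help to fix the direction of each conversion. The numbers are purely illustrative: we assume a canvas whose drawing buffer is 1600x800 but which CSS displays at 800x400, and `make_scaler` is a hypothetical DOM-free stand-in for the class.

```javascript
// Sketch only: a DOM-free stand-in using the same ratios, with made-up sizes.
const make_scaler = function ({ canvas_width, canvas_height, css_width, css_height }) {
  return {
    to_canvas_horizontal: (length) => length * canvas_width / css_width,
    to_canvas_vertical: (length) => length * canvas_height / css_height,
    from_canvas_horizontal: (length) => length * css_width / canvas_width,
    from_canvas_vertical: (length) => length * css_height / canvas_height
  };
};

const scaler = make_scaler({
  canvas_width: 1600, canvas_height: 800, // drawing buffer size
  css_width: 800, css_height: 400         // size as displayed on the page
});

const on_canvas = scaler.to_canvas_horizontal(100);          // 100 * 1600 / 800 = 200
const round_trip = scaler.from_canvas_horizontal(on_canvas); // 200 * 800 / 1600 = 100
const vertical = scaler.to_canvas_vertical(50);              // 50 * 800 / 400 = 100
```

Converting to the canvas basis and back returns the original length, which is the property we rely on when keeping the overlaid input and the drawn rectangle the same size.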

With these utilities we can jump into examining the TextBox class.

The TextBox class

If you watched the video at the top of the post you will have seen the behaviour that we are aiming to implement here. When the text-input tool is activated we want to be able to click at a point on the canvas and create a text-input box, into which we enter our text. When we press return the text is rendered in a box, and if we click on the box again we should be able to edit the text. Once the text box has been rendered we should also be able to activate our selection tool and drag the box around the canvas.

Given that our text-box is a rectangle and that we want to be able to drag and reposition it on the canvas, it should not be surprising that we reuse one of the existing Rectangle classes, as introduced in Part II, to give us some of this functionality. Our new TextBox will extend the DraggableRectangle class:

class TextBox extends DraggableRectangle {
  constructor({x, y, width, height, canvas, font_colour, font_size, background_colour}={}) {
    super({x, y, width, height, canvas});
    this.font_colour = font_colour;
    this.font_size = font_size;
    this.background_colour = background_colour;
    this.scaler = new CanvasScaler(canvas);
    this.text_box = null;
    this.text = "";
    this.text_box = this._add_textbox();
    this.input = this.text_box.querySelector("[contenteditable]");
    this.padding = 0.25*Number(window.getComputedStyle(document.body).getPropertyValue('font-size').match(/\d+/)[0]);
    // Draw the (initially empty) box and register it with the SHAPE module
    SHAPE.add_rectangle(this.draw());
  }

  // ... the methods discussed in the following sections
}

When building an instance of TextBox we must pass the usual parameters associated with our Rectangle class, namely the (x,y) viewport co-ordinates of the rectangle, dimensions for the rectangle width and height, and the canvas element on which we want to overlay the rectangle. These attributes are relayed to the base class via the call to super. Note also that the constructor calls the this.draw() method on the TextBox. This returns the current instance, which we then pass over to the SHAPE module via the add_rectangle call. We can see from the implementation in the SHAPE module that this method will simply add our new TextBox instance to the all_rectangles array:

  s.add_rectangle = function(rect){
    all_rectangles.push(rect);
  };

By adding our DraggableRectangle subclass to this array, we can leverage the select-to-drag functionality that we implemented for our rectangles in Part II.
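
The reason draw returns the instance is that it lets the constructor draw and register in a single expression. Here is a stripped-down sketch of that hand-off; `DemoShapes` and `DemoBox` are hypothetical stand-ins, not the SHAPE module or the TextBox class themselves.

```javascript
// Sketch only: DemoShapes and DemoBox are hypothetical stand-ins.
const DemoShapes = (function () {
  const all_rectangles = []; // private list of registered shapes
  return {
    add_rectangle: function (rect) {
      all_rectangles.push(rect);
    },
    count: function () {
      return all_rectangles.length;
    },
    last: function () {
      return all_rectangles[all_rectangles.length - 1];
    }
  };
})();

class DemoBox {
  constructor() {
    this.drawn = false;
    // draw() returns the instance, so it can be registered in one expression.
    DemoShapes.add_rectangle(this.draw());
  }

  draw() {
    this.drawn = true;
    return this;
  }
}

const box = new DemoBox();
```

Anything stored in the shared list can then be picked up by the selection tool, which only needs the list entries to behave like draggable rectangles.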

In addition to the properties relating to the drawing of the geometric shape, the TextBox instance will also store attributes relating to the text: font_colour, font_size and background_colour. And the actual text value, this.text, will be initialized to a blank string.

The TextBox will also maintain its own instance of the CanvasScaler class we introduced in the previous section; for convenience, this scaler is stored in the this.scaler property on the TextBox instance.

The last major function of this constructor is to set up the actual input element for collecting the user's text input. This is achieved by the this._add_textbox method, with a reference stored in the this.text_box property. Before concluding, the constructor will cache a couple of values for convenient use later: this.input holds a reference to the actual contenteditable input field, while this.padding caches a calculated padding value for the input element, based on the document font-size. Most of the interesting logic is captured in the _add_textbox method, so we will focus on this method in the next section.

The _add_textbox method

This method is called within the TextBox constructor to create the actual text input that the user will interact with. It will dynamically build a $wrapper div containing a $input element and a $close_btn element. The $wrapper element will be absolutely positioned on the page using the (x, y) viewport coordinates passed to the TextBox constructor (which, in turn, come from the user's touchend or mouseup interaction with the canvas).

The $input element is set to be contenteditable and we apply a series of styles to this element. Importantly, we set the width and height of the input element to match the rectangle dimensions, but this requires the use of our CanvasScaler to convert the rectangle dimensions from the canvas scale into the viewport scale.

Finally we stitch together the different parts of the $wrapper element, append it to the document and set browser focus on the contenteditable $input element, awaiting the user's input.

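  _add_textbox() {
    // Add an absolutely positioned text input
    const $wrapper = document.createElement("div"),
      $input = document.createElement("span"),
      $close_btn = document.createElement("span");
    $close_btn.innerText = "X";

    $wrapper.style.position = "absolute";
    $wrapper.style.top = this.top_left().y+"px";
    $wrapper.style.left = this.top_left().x+"px";

    $input.setAttribute("contenteditable", true);
    // ... four further style assignments ("absolute", "0px", "0px", "auto")
    // whose targets are not shown here

    // Match the rectangle's dimensions, converted to viewport pixels.
    // Assumption: minimum sizes are used so the input can grow with the text.
    $input.style.minHeight = this.scaler.scale_from_canvas_vertical(this.height)+"px";
    $input.style.minWidth = this.scaler.scale_from_canvas_horizontal(this.width)+"px";
    $input.role = "textbox";
    $input.style.color = this.font_colour;
    $input.style.fontSize = this.font_size + "px";
    $input.style.backgroundColor = this.background_colour;

    $input.addEventListener("input", debounce(this.draw.bind(this), 1000), false);
    $input.addEventListener("input", this._resize_box.bind(this), false);
    $close_btn.addEventListener("click", this.destroy.bind(this), false);

    // Stitch the parts together, add them to the page and focus the input
    // (appending to document.body is an assumption)
    $wrapper.appendChild($input);
    $wrapper.appendChild($close_btn);
    document.body.appendChild($wrapper);
    $input.focus();
    return $wrapper;
  }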

At the end of the _add_textbox method you can see that we attach a number of listeners to handle the user input. The click-handler on the $close_btn will simply destroy this TextBox and its associated DOM element. We also add two listeners to the input event on the $input element. The draw handler is debounced using the utility function we introduced earlier. This listener is intended to render the text caption when the user has finished typing, which we infer from a pause of 1000ms.

The draw function (shown below) initializes the canvas if this has not already happened. As a quick reminder, each Rectangle instance creates a completely new canvas element in the DOM which is overlaid on the original canvas and inherits some of its attributes. With the new canvas set up, super.draw is invoked, setting the fill parameter provided the background-colour has been set to any non-transparent colour. This step will draw the background rectangle for our text caption, but the actual text is rendered by means of the _draw_text invocation. This function will write text to the canvas using the native fillText method, taking care to set the correct position, colour and size for the text. In particular, determining the exact canvas position for the text requires a little thought, as we want this text to precisely match the position of the content added to the contenteditable $input element. To calculate this we need to incorporate the top-left of the containing Rectangle along with the font-size and padding, as shown:

    $input.addEventListener("input", debounce(this.draw.bind(this), 1000), false);

  draw() {
    // (canvas initialization, when needed, happens first and is not shown)
    const orig_colour = this.ctx.fillStyle;
    this.ctx.fillStyle = this.background_colour;
    super.draw({fill: this.background_colour!=="transparent"});
    this.ctx.fillStyle = orig_colour;
    this._draw_text();
    return this;
  }

  _draw_text() {
    const point = new Point({
        x: this.top_left().x + this.padding,
        y: this.top_left().y + this.font_size + this.padding,
        canvas: this.canvas
    });
    this.with_fill_colour(this.font_colour, function(){
      this.ctx.font = this.font_size + "px sans-serif";
      this.ctx.fillText(this.input.textContent, point.canvas_x, point.canvas_y);
    }.bind(this));
    // Hide the editable overlay once the text is on the canvas
    // (assumption: the hidden element is the wrapper held in this.text_box)
    this.text_box.style.display = "none";
  }
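
The positioning arithmetic is easier to check in isolation. In the sketch below `text_baseline` is a hypothetical helper and the numbers are made up; it only mirrors the two sums used to build the point that is handed to fillText.

```javascript
// Sketch only: a pure function in viewport pixels, with illustrative numbers.
const text_baseline = function ({ left, top, font_size, padding }) {
  return {
    // start the text just inside the left edge of the box
    x: left + padding,
    // fillText positions text by its baseline by default, so move down by
    // roughly one line of text from the top edge of the box
    y: top + font_size + padding
  };
};

const baseline = text_baseline({ left: 100, top: 40, font_size: 16, padding: 4 });
console.log(baseline); // x: 100 + 4, y: 40 + 16 + 4
```

Without the font_size term the text would be drawn with its baseline on the top edge of the rectangle, so most of each glyph would sit above the box.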

The draw handler is only triggered after a pause in the user input. By contrast the _resize_box handler will fire on each input event:

    $input.addEventListener("input", this._resize_box.bind(this), false);

  _resize_box(event) {
    const old_width = this.width,
      old_height = this.height;

    this.width = this.scaler.scale_to_canvas_horizontal(this.input.clientWidth + 2*this.padding);
    this.height = this.scaler.scale_to_canvas_vertical(this.input.clientHeight);
    this.x = this.x + (this.width-old_width)/2;
    this.y = this.y + (this.height-old_height)/2;
    // if(event.inputType==="insertText" && ...){ ... }  (remainder not shown)
  }

This function will read the new clientWidth and clientHeight of our contenteditable input element. It will then use the change in width and height to recalculate the position and dimensions of our rectangle in canvas coordinates. This ensures the bounding Rectangle expands automatically to accommodate the text typed by the user. Conveniently, in this regard, our CSS for the contenteditable element ensures that its width will resize appropriately to accommodate the user input:

.text-wrapper [contenteditable] {
  display: inline-block;
  width: auto;
  white-space: nowrap;
}
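
The half-difference adjustments to x and y in the resize handler are worth a quick check. The sketch below assumes that x and y mark the centre of the rectangle, which is an inference from the arithmetic rather than something stated here; `resize_keeping_top_left` and `top_left` are hypothetical helpers.

```javascript
// Sketch only: assumes x and y mark the centre of the rectangle.
const resize_keeping_top_left = function (rect, new_width, new_height) {
  return {
    width: new_width,
    height: new_height,
    // move the centre by half the change in size
    x: rect.x + (new_width - rect.width) / 2,
    y: rect.y + (new_height - rect.height) / 2
  };
};

const top_left = (rect) => ({
  x: rect.x - rect.width / 2,
  y: rect.y - rect.height / 2
});

const before = { x: 100, y: 50, width: 40, height: 20 };
// The user types more text and the box doubles in width.
const after = resize_keeping_top_left(before, 80, 20);

console.log(top_left(before), top_left(after)); // the same corner both times
```

Under that assumption the box grows to the right and downwards while its top-left corner stays where the user clicked, which keeps it aligned with the overlaid input.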

With these pieces stitched together we get the behaviour we were aiming for.


In this tutorial we have demonstrated how we can implement functionality which allows a user to add text captions to our drawing canvas. This required that we introduce a new TextBox class which extended the previously-implemented DraggableRectangle. The new TextBox had to extend the draw method so that it not only draws the rectangle on the canvas but also draws the user-entered text within the rectangle, using the native fillText method. The TextBox instance also needed to manage an absolutely positioned contenteditable element which was revealed or hidden depending upon whether the user was actively inputting or editing the text caption.

So that's it, the last tutorial in this four-part series. I hope you found it useful. If you have been here from the start … Congratulations, I seriously didn't expect anyone to stick with the whole thing! I could barely motivate myself to write it :)

As always, if you have any questions or feedback please let me know in the comments section.


  1. You can access a hosted version of the drawing tool here
  2. The GitHub repo to accompany this series of blog posts
  3. The version of the tool built in this tutorial can be found on this branch
  4. The revealing module pattern for modular Javascript
  5. The MDN docs for fillText method.

