Sunday 21 August 2016

GSOC 2016- Complete picture of “Integrate Google Cloud Vision API to Drupal 8” project- Final Submission

TL;DR The Google Summer of Code period has ended, and I am glad that I was able to meet all the goals and develop something productive for the Drupal community. In this blog post, I will be sharing the details of the project, the functionality of the module and its current status.


I am glad that I was one of the lucky students who were selected to be a part of the Google Summer of Code 2016 program for the project “Integrate Google Cloud Vision API to Drupal 8”. The project was under the mentorship of Naveen Valecha, Christian López Espínola and Eugene Ilyin. Under their mentoring and guidance, I was able to meet all the goals and develop something productive for the Drupal community.


Let me first share why the Google Vision API module may be required.


Google Cloud Vision API brings automated content analysis to images. The API can not only detect objects ranging from animals to famous monuments, but also detect faces and the emotions on them. In addition, the API can help censor images, extract text from images, detect logos and landmarks, and even identify attributes of the image itself, for instance the dominant colors in the image. Thus, it can serve as a powerful content analysis tool for images.


Now let us see how we can put the module to use, i.e. what its use cases are. To start with, the Google Vision API module allows taxonomy tagging of image files using Label Detection.
Label Detection classifies images into a number of general-purpose categories. For example, it can classify a war scene with labels such as war, troops, soldiers and transport, based on the surroundings in the image. This feature of the module is especially useful for filtering images based on tags.


The second use case is Safe Search Detection. It quickly identifies and detects the presence of any explicit or violent content in an image which is not fit for display.
When this feature is enabled in the module, Safe Search validates any uploaded image for explicit/violent content. If such content is found, the image is held for moderation and is not allowed onto the site, thus keeping the site clean.


Please click here for a video demonstration of the two above-mentioned use cases.


Continuing with the other use cases, the third one is filling the Alternate Text field of an image file.
The Label, Logo, Landmark and Optical Character Detection features of the Google Cloud Vision API have been used to implement this use case. Based on the choice entered by the end user, he/she can have the Alternate Text for any image auto-filled by one of the four above-mentioned options. The choice “Label Detection” fills the field with the first value returned in the API response. “Logo Detection” identifies the logos of famous brands, and can be used to fill the field accordingly. Likewise, “Landmark Detection” identifies monuments and structures, ranging from natural to man-made, and “Optical Character Detection” detects and identifies text within an image, filling the Alternate Text field accordingly.


Next comes the User Emotion Detection feature.
This feature is especially useful at new account creation. When enabled, it detects the emotion of the user in the profile picture and notifies the new user if he/she seems unhappy in the image, prompting them to upload a happier one.


Lastly, the module also allows displaying similar image files.
Based on the dominant color component (red, green or blue), the module quickly groups all the images which share the same dominant color component, and displays them as a list under the “Similar Content” tab. Each item links to the image file and is named after the filename saved by the user.
Users should note that by “similar contents” we do not mean that the images will always resemble each other; rather, they share the same dominant color component.


All the details of my work, along with the interesting facts and features, have been shared on Drupal Planet.


Please watch this video to learn how to use the above-mentioned use cases properly.

This is the complete picture of the Google Vision API module developed during the Google Summer of Code phase (May 23, 2016 to August 23, 2016).


With this, the three wonderful months of the Google Summer of Code phase come to an end, enriching me with lots of experience, and letting me meet great people and work with them. In addition to giving me an asset, it also boosted and enhanced my skills. I learnt a lot of new techniques which I probably would not have learnt otherwise: the use of services and dependency injection, constraints and validators, controllers, automated tests, and the concepts of entities and entity types, to name a few.

I will put these concepts to use in the best possible way, and try to contribute to the Drupal community with my best efforts.

Saturday 20 August 2016

GSOC 2016- Work Product of Google Vision API project- Final Evaluation


TL;DR In this blog post, I will be sharing the details of the project, the tasks which I have completed as a part of Google Summer of Code 2016, along with the patches and issue details.


First of all, let me share the link to all my code patches and contributions. These are listed in the commit log of the Google Vision API module.

Let me now share what my project is based on, and the tasks I had proposed to complete.

Google Cloud Vision API brings automated content analysis to images. The API can not only detect objects ranging from animals to famous monuments, but also detect faces and the emotions on them. In addition, the API can help censor images, extract text from images, detect logos and landmarks, and even identify attributes of the image itself, for instance the dominant colors in the image.
All the features which I had proposed to implement are listed below:
  1. Integrate the Label Detection feature with the image field.
  2. Integrate Landmark Detection with the image field.
  3. Integrate Logo Detection with the image field.
  4. Integrate Explicit Content Detection with the image field.
  5. Integrate Optical Character Recognition with the image field.
  6. Integrate Face Detection with the image field.
  7. Integrate Image Attributes with the image field.

After discussion with my mentors, we decided on the following use cases to implement the features proposed above:
  1. Use the Label, Landmark, Logo and Optical Character Detection to fill the Alternate Text field of the image files uploaded by the user.
  2. Use the Explicit Content Detection feature to identify and detect any explicit or violent content present in the images, and prevent the uploading of such content.
  3. Use the Face Detection feature to detect the emotions of the users in their profile pictures, and notify the users if they seem to be unhappy.
  4. Use the Image Attributes feature to detect the dominant color in the image, and group the image files on that basis.

In addition to these implementations, I worked on developing tests for the functionality of the module, applying important Drupal 8 concepts such as services and containers, and abstract parent classes for the tests.

I made my contributions to Drupal in the form of a module named Google Vision API.
All the contributions were made under the guidance and supervision of my mentors, Naveen Valecha, Christian López Espínola and Eugene Ilyin, and were committed to the module only with their approval.
And here is the link to my Drupal.org profile: https://www.drupal.org/u/ajalan065.

In order to share the weekly progress of the project with the Drupal community, I maintained blog posts on Drupal Planet, where I shared my work experiences, the tasks I accomplished, and the issues or problems I faced along with their solutions. Please click here to read all the blog posts.

This is the complete picture of my code and contributions during the Google Summer of Code period (May 23, 2016 to August 23, 2016).

Tuesday 16 August 2016

GSOC 2016- Making Label Detection results configurable and improving documentation- Week 12

TL;DR Last week I worked on moving the helper functions for filling the Alt Text of image files to a new service, and on moving the reused/supporting functions of the tests to an abstract parent class, GoogleVisionTestBase. This week I have worked on improving the documentation of the module and making the Label Detection results configurable.


With all major issues and features committed to the module, this week I worked on a few minor issues, including documentation and cleanup in the project.


It is an immense pleasure for me that the Google Vision API module is receiving feedback from the community. An issue, Improve documentation for helper functions, was created to expand the documentation and describe the code in more detail. I have worked on it, adding more documentation to the helper functions so that they can be understood better.


In addition, a need was felt to make the number of results obtained from the Vision API for each feature configurable, giving the end user control over it. The corresponding issue is Make max results for Label Detection configurable. In my humble opinion, most of the feature implementations and requests to the Google Cloud Vision API have no need for a user-configurable number of results. For instance, the Safe Search Detection feature detects and blocks explicit content from being uploaded, and does not need a configurable number of results. However, taxonomy tagging using Label Detection should be user-dependent, so I worked on the issue to make the value configurable for Label Detection only. This value can be configured from the Google Vision settings page, where we set the API key. I have also developed simple web tests to verify that the value is configurable. Presently, the issue is under review.
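
For illustration, here is a minimal sketch of how such a setting can be exposed on a ConfigFormBase settings form. The class, config object and key names below are assumptions for the sketch, not necessarily the module's actual names:

  <?php

  namespace Drupal\google_vision\Form;

  use Drupal\Core\Form\ConfigFormBase;
  use Drupal\Core\Form\FormStateInterface;

  /**
   * Settings form carrying the Label Detection max results value.
   */
  class SettingsForm extends ConfigFormBase {

    public function getFormId() {
      return 'google_vision_settings';
    }

    protected function getEditableConfigNames() {
      return ['google_vision.settings'];
    }

    public function buildForm(array $form, FormStateInterface $form_state) {
      $config = $this->config('google_vision.settings');
      // Number of results requested from the Vision API for Label Detection.
      $form['max_results_labels'] = [
        '#type' => 'number',
        '#title' => $this->t('Max results for Label Detection'),
        '#min' => 1,
        '#default_value' => $config->get('max_results_labels') ?: 5,
      ];
      return parent::buildForm($form, $form_state);
    }

    public function submitForm(array &$form, FormStateInterface $form_state) {
      $this->config('google_vision.settings')
        ->set('max_results_labels', $form_state->getValue('max_results_labels'))
        ->save();
      parent::submitForm($form, $form_state);
    }

  }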


I have also worked on standard coding fixes and PAReview fixes, and assisted my mentor, Naveen Valecha, in developing interfaces for the services. I assisted him with the access modifiers of the functions, and with fixing the documentation issues which clashed with the existing documentation.


Lastly, I worked on improving the README and the module page to include all the new information and instructions implemented during the Google Summer of Code phase.


With all this work done and all the minor issues resolved, I believe that the module is ready for use, with all the features and end-user cases implemented.

Next week, I’ll work on creating a video demonstration of how to use Google Vision API to fill the Alt Text attribute of an image file, detect the emotion in user profile pictures, and group similar images which share the same dominant color.

Tuesday 9 August 2016

GSOC 2016- Moving supporting functions to services and abstract parent classes- Week 11

TL;DR Last week I worked on modifying the tests for the “Fill Alt Text”, “Emotion Detection” and “Image Properties” features of the Google Vision API module. The only tasks left were moving the supporting functions to a separate service, and creating an abstract parent class for the tests and moving the shared functions there.


The issues Alt Text field gets properly filled using various detection features, Emotion Detection (Face Detection) feature and Implementation of Image Properties feature of the Google Vision API module are still under review by my mentors. Meanwhile, my mentors asked me to move the supporting functions of the “Fill Alt Text” issue to a separate service and use them from there. In addition, they suggested that I create an abstract parent class for the Google Vision simple tests, and move the supporting functions to that parent class. Thus, this week I worked on following these suggestions and implementing them.


There are a few supporting functions, namely google_vision_set_alt_text() and google_vision_edit_alt_text(), which fill the Alt Text in accordance with the feature requested from the Vision API, and also manipulate the value if needed. I moved these functions into a separate service, namely FillAltText, and altered the code to use them from there instead of calling them directly.
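
As a rough sketch, the service is registered in google_vision.services.yml and the helpers become methods on the class. The service id and method names below are assumptions mirroring the old procedural helpers:

  # In google_vision.services.yml:
  services:
    google_vision.fill_alt_text:
      class: Drupal\google_vision\FillAltText

  <?php

  namespace Drupal\google_vision;

  use Drupal\file\FileInterface;

  /**
   * Holds the helpers that fill the Alt Text of image files.
   */
  class FillAltText {

    /**
     * Fills the Alt Text according to the detection feature requested.
     */
    public function setAltText(FileInterface $file, $feature) {
      // Query the Vision API for $feature and store the first returned
      // description in the file's Alt Text field.
    }

    /**
     * Manipulates an already stored Alt Text value if needed.
     */
    public function editAltText(FileInterface $file, $value) {
      // Overwrite the stored Alt Text with the supplied value.
    }

  }

Callers then use \Drupal::service('google_vision.fill_alt_text')->setAltText(...) or, better, an injected instance, instead of the global functions.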


In addition, a number of supporting functions used in the simple web tests of the module, to create users, contents and fields, were placed in the test files themselves, which is a kind of redundancy. Hence, I moved all these supporting functions to an abstract parent class named GoogleVisionTestBase, and altered the test classes to extend that parent class instead of extending WebTestBase directly. This removed the redundant code, and gave a proper structure and orientation to the web tests.
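
A minimal sketch of what such a base class can look like; GoogleVisionTestBase is the real class name, while the module list and helper shown here are assumptions:

  <?php

  namespace Drupal\google_vision\Tests;

  use Drupal\simpletest\WebTestBase;

  /**
   * Shared setup and helpers for the Google Vision web tests.
   */
  abstract class GoogleVisionTestBase extends WebTestBase {

    /**
     * Modules every Google Vision test needs.
     *
     * @var array
     */
    public static $modules = ['google_vision', 'node', 'image'];

    /**
     * Creates and logs in a user with the permissions the tests share.
     */
    protected function createAndLoginUser() {
      $user = $this->drupalCreateUser(['administer site configuration']);
      $this->drupalLogin($user);
      return $user;
    }

  }

Each test class then simply declares, for example, class SafeSearchTest extends GoogleVisionTestBase, and inherits the shared helpers instead of re-declaring them on top of WebTestBase.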

These minor changes would be committed to the module directly, once the major issues are reviewed by my mentors and committed to the module.

Wednesday 3 August 2016

GSOC 2016- Modifying tests for “Fill Alt Text”, “Emotion Detection” and “Image Properties” features of Google Vision module- Week 10

TL;DR Last week, I worked on and developed tests to ensure that the Alt Text field of an image file gets filled in accordance with the various detection features of the Vision API, namely Label Detection, Landmark Detection, Logo Detection and Optical Character Detection. This week I have worked on modifying and adding tests for various features of the Google Vision module, namely the filling of the Alt Text field, emotion detection in user pictures, and grouping image files on the basis of their dominant color component.


My mentors reviewed the code and the tests which I had put up for review to get them committed to the Google Vision API module. However, the code needed some amendments pointed out by my mentors, which had to be corrected before commit. Hence, I spent this week working on these issues and resolving the flaws, rather than starting a new feature.
Let me discuss my work in detail.


I had submitted the code and the tests which ensure that the Alt Text field gets properly filled using the various detection features according to the end user's choice. However, as my mentor pointed out, it had one drawback: the user was not able to manipulate or change the value of the field afterwards. There was also a small bug among the different options available to fill the Alt Text field: once an option was selected, it was possible to switch between the options, but disabling them did not work. After this was pointed out, I worked on modifying the feature to give the end user the ability to manipulate the value of the field as and when required. I also resolved the second bug, so that the feature can be disabled again.


Regarding the Emotion Detection (Face Detection) feature of the Vision API, I was guided to use injection instead of using static methods directly; for example, the use of get('entity_type.manager') over the static call \Drupal::entityTypeManager(). Apart from these minor changes, a major issue was that the feature was triggered whenever an image file was involved at all. It needed to run only when the user uploads an image, and not on its removal (as both actions involve an image file, hence the bug).
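
The injection pattern looks roughly like this; the class name here is hypothetical, and only the injection mechanics are the point:

  <?php

  namespace Drupal\google_vision;

  use Drupal\Core\DependencyInjection\ContainerInjectionInterface;
  use Drupal\Core\Entity\EntityTypeManagerInterface;
  use Symfony\Component\DependencyInjection\ContainerInterface;

  /**
   * Hypothetical emotion notifier showing constructor injection.
   */
  class EmotionNotifier implements ContainerInjectionInterface {

    /**
     * The entity type manager.
     *
     * @var \Drupal\Core\Entity\EntityTypeManagerInterface
     */
    protected $entityTypeManager;

    public function __construct(EntityTypeManagerInterface $entity_type_manager) {
      $this->entityTypeManager = $entity_type_manager;
    }

    public static function create(ContainerInterface $container) {
      // The container supplies the service; no static \Drupal:: call needed.
      return new static($container->get('entity_type.manager'));
    }

  }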


In the issue Implementation of Image Properties feature in the Vision API, I had queried the database multiple times in a loop to fetch results and build the routed page using the controllers. My mentor pointed out that this is a poor way of fetching results from the database. Hence, I modified the code to fetch the results with a single query and use them to build the page. In addition, I was asked to build the list using ‘item_list’ instead of the conventional ‘#prefix’ and ‘#suffix’. Another important change in my approach was moving away from db_query(), whose use is discouraged; I switched to the select query builder and addExpression() instead.
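
The render-array change is small. Assuming $links already holds one \Drupal\Core\Link object per similar file, the list is built like this instead of via hand-written markup:

  // Build the page as a proper render array; the theme layer renders
  // the list markup that '#prefix'/'#suffix' used to hard-code.
  $build['similar_content'] = [
    '#theme' => 'item_list',
    '#items' => $links,
  ];
  return $build;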

Presently, the code is under review by the mentors. I will work further on it once it is reviewed and I receive further instructions.

Wednesday 27 July 2016

GSOC 2016- Developing tests for “Fill Alt Text” feature of Google Vision module- Week 9

TL;DR Last week, I worked on and developed tests to ensure that similar images are grouped in accordance with the Image Properties feature of the Vision API. The code is under review by the mentors, and I will continue on it once the review is done. Meanwhile, they also reviewed the “Fill Alt Text” feature issue and approved it as good to go. This week, I have worked on developing tests for this issue.


An important feature I have implemented in the Google Vision API module is the filling of the Alt Text field of an image file entity by any of four choices: Label Detection, Landmark Detection, Logo Detection and Optical Character Detection. My mentor suggested that I check the availability of the response before filling the field, as we cannot fully rely on third-party responses. With this minor suggestion implemented, it was time to develop tests to verify the functionality of this feature.


I started developing simple web tests for this feature, to ensure that the Alt Text field is properly filled in accordance with the user's choice. This requires selecting the four choices one by one and verifying that the field is filled correctly; thus we require four tests to cover the entire functionality. I added an extra test to ensure that if none of the options is selected, the field remains empty.


I created the image files using the images available in simpletest; they can be accessed through drupalGetTestFiles(). The filling, however, requires a call to the Google Cloud Vision API, thus introducing a dependency on the API key. To remove the dependency, I mocked the API call in a test module, returning custom data instead.
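
One common way to do this in Drupal 8, sketched here under the assumption that the API call lives in a swappable service with the id google_vision.api (both that id and the FakeGoogleVisionApi class are assumed names), is a service provider in the test module that substitutes a stub class:

  <?php

  namespace Drupal\google_vision_test;

  use Drupal\Core\DependencyInjection\ContainerBuilder;
  use Drupal\Core\DependencyInjection\ServiceProviderBase;

  /**
   * Swaps the real Vision API client for a stub during tests.
   */
  class GoogleVisionTestServiceProvider extends ServiceProviderBase {

    public function alter(ContainerBuilder $container) {
      // Point the API service at a stub returning canned responses,
      // so no request ever reaches the real Google endpoint.
      if ($container->hasDefinition('google_vision.api')) {
        $container->getDefinition('google_vision.api')
          ->setClass('Drupal\google_vision_test\FakeGoogleVisionApi');
      }
    }

  }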


The first test ensures that the Label Detection feature returns the correct response and that the Alt Text field is filled correctly. Simpletest provides a list of assertions to verify it; I found assertFieldByName() to be the most suitable for this purpose, as it asserts the value of a field based on the field name. The second test ensures that the Landmark Detection feature works correctly. Similarly, the third and fourth tests ensure the correct functionality of the Logo and Optical Character Detection features.
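
A sketch of what the first test's body can look like; the form path, field name, option key and canned label value are all assumptions, and $this->file is assumed to be created in setUp():

  public function testLabelDetectionFillsAltText() {
    // Choose the Label Detection option on the file's edit form and save.
    $edit = ['alt_text_option' => 'label_detection'];
    $this->drupalPostForm('file/' . $this->file->id() . '/edit', $edit, t('Save'));
    // The mocked API response lists 'mountain' first, so the Alt Text
    // field should now carry exactly that value.
    $this->drupalGet('file/' . $this->file->id() . '/edit');
    $this->assertFieldByName('field_alt_text[0][value]', 'mountain');
  }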


The fifth test covers the case when none of the options is selected. It ensures that in this case the Alt Text field remains empty and does not contain any unwanted values.

I have posted the patch covering the suggestions and tests in the issue queue, Fill the Alt Text of the Image File using Google Vision API, to be reviewed by my mentors. Once they review it, I will work on it further if required.

Tuesday 26 July 2016

GSOC 2016- Developing tests for Image Properties feature of Google Vision module- Week 8

TL;DR In the past two weeks I worked on using the Image Properties feature offered by the Google Cloud Vision API to group image files together on the basis of the dominant color components in them. In addition, I worked on filling the Alternate Text field of image files based on the results of Label/Landmark/Logo/Optical Character Detection, as chosen by the end user. This week, I have worked on and developed tests to ensure that similar images are grouped in accordance with the Image Properties feature of the Vision API.


At present, the Google Vision API module supports the Label Detection feature used for taxonomy terms, the Safe Search Detection feature to avoid displaying any explicit content or violence, and User Emotion Detection to detect the emotions of users in their profile pictures and notify them about it.


I had worked on grouping images on the basis of the dominant color component (red, green or blue) of which they are composed. I got the code reviewed by my mentors, and they approved it with minor suggestions on injecting dependencies through constructors wherever possible.
Following their suggestions, I injected the Connection object instead of accessing the database via \Drupal::database().
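
In a controller this is done through create(); a trimmed sketch, with a hypothetical class name:

  <?php

  namespace Drupal\google_vision\Controller;

  use Drupal\Core\Controller\ControllerBase;
  use Drupal\Core\Database\Connection;
  use Symfony\Component\DependencyInjection\ContainerInterface;

  /**
   * Controller that receives the database connection by injection.
   */
  class SimilarContentController extends ControllerBase {

    /**
     * The database connection.
     *
     * @var \Drupal\Core\Database\Connection
     */
    protected $database;

    public function __construct(Connection $database) {
      $this->database = $database;
    }

    public static function create(ContainerInterface $container) {
      // Injected once here, instead of \Drupal::database() scattered inline.
      return new static($container->get('database'));
    }

  }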


After making changes as per the suggestions, I started developing simple web tests for this feature, to ensure that similar images get displayed under the Similar Content tab. This requires the creation of a new taxonomy vocabulary and adding an entity reference field to the image file entity. After creating the new vocabulary and adding the new field to the image file, I created the image files using the images available in simpletest, accessible through drupalGetTestFiles(). The first test ensures that if the vocabulary named ‘Dominant Color’ is selected, the similar images get displayed under the file/{file_id}/similarcontent link.


The grouping, however, requires a call to the Google Cloud Vision API, thus introducing a dependency on the API key. To remove the dependency, I mocked the API call in the test module, returning custom data to implement the grouping.


To cover the negative case, i.e. when the Dominant Color option is not selected, I developed another test which creates a demo vocabulary that simply stores labels instead of the dominant color component. In this case, the file/{file_id}/similarcontent link displays the message “No items found”.

I have posted the patch covering the suggestions and tests in the issue queue to be reviewed by my mentors. Once they review it, I will work on it further if required.

Thursday 14 July 2016

GSOC 2016- Detection of image files and filling its Alt Text field- Week 7

TL;DR The previous week I worked on detecting the emotion in the profile pictures of users, and notifying them to change the image if they do not look happy. The work is under review by the mentors; once it is reviewed, I will resume it if it needs any changes. This week I have worked on filling the ‘Alt Text’ field of an image file based on any one of the methods selected by the end user: Label Detection, Landmark Detection, Logo Detection or Optical Character Detection.


Last week, I worked on implementing the Face Detection feature in the Google Vision API module. The code is currently under review by the mentors. Once they review it, I will develop it further if it requires any changes.


The Google Cloud Vision API provides features to detect popular landmarks in an image (Landmark Detection), logos of popular brands (Logo Detection) and text within an image (Optical Character Detection), in addition to Label Detection. These features, though of less significance individually, are helpful in identifying an image. Hence, I have started working on implementing a new, helpful use case for the users: filling the Alternate Text field of an image file using these features.


The Alt Text field of the image file entity is modified to incorporate the options to fill the field using these features. The user may select any one of the four options to fill the Alt Text field of the image.


Coming to the technical aspect, I have made use of hook_form_BASE_FORM_ID_alter() to alter the Alternate Text field of the image file entity. I modified the edit form of the Alt Text field to add four radio options, namely Label Detection, Landmark Detection, Logo Detection and Optical Character Detection. The user may select any of the options and save the configuration, and the Alternate Text field is filled accordingly.
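
In outline, the alteration looks like the following sketch; the hook suffix, element names and option keys are assumptions about naming, not the module's exact code:

  /**
   * Implements hook_form_BASE_FORM_ID_alter() for the file form.
   */
  function google_vision_form_file_form_alter(&$form, \Drupal\Core\Form\FormStateInterface $form_state, $form_id) {
    // Offer the four detection features next to the Alt Text field.
    $form['alt_text_option'] = [
      '#type' => 'radios',
      '#title' => t('Fill the Alternate Text field using'),
      '#options' => [
        'label' => t('Label Detection'),
        'landmark' => t('Landmark Detection'),
        'logo' => t('Logo Detection'),
        'ocr' => t('Optical Character Detection'),
      ],
    ];
    $form['actions']['submit']['#submit'][] = 'google_vision_fill_alt_text_submit';
  }

  /**
   * Submit callback: fills the Alt Text from the selected feature.
   */
  function google_vision_fill_alt_text_submit(array $form, \Drupal\Core\Form\FormStateInterface $form_state) {
    // Query the Vision API for the chosen feature and copy the first
    // returned description into the file's Alt Text value.
  }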

Presently, the code is under review by the mentors. Once it is reviewed, I will make the suggested changes, if required.

Wednesday 6 July 2016

GSOC 2016- Starting with Face Detection feature of Google Cloud Vision API- Week 6

TL;DR The previous week I worked on grouping contents based on the dominant color component in their images, if present. The work is under review by the mentors, and once it is reviewed, I will work further on that issue. Meanwhile, I have started developing and implementing the Emotion Detection feature of the Google Cloud Vision API. It detects the emotion of the person in the uploaded profile picture, and if the person looks angry or unhappy, he/she is notified accordingly. This feature is especially important when building sites for professional purposes, as facial expressions matter a lot in such cases.


Last week, I worked on implementing the Dominant Color Detection feature in the Google Vision API module. The code is currently under review by the mentors. Once they review it, I will develop it further if it requires any changes.


Meanwhile, I have started working on implementing a new feature: Face Detection in an image. This feature gives us the location of the face in an image and, in addition, the emotions and expressions on the face.


I have used this feature to detect the emotion of the person in the profile picture uploaded by him/her. If the person does not seem happy in the image, he/she is notified about it. This is especially useful when the end users are developing a site for professional purposes, as expressions matter a lot in professional settings.


Coming to the technical aspect, I have made use of hook_entity_bundle_field_info_alter() to alter the image fields and check the emotions in the uploaded images. This hook has been used because we only want to implement this feature on image fields. If the image is not a happy one, an appropriate message is displayed using drupal_set_message(). This feature also makes use of Constraints and Validators, just like the Safe Search Detection feature.
Presently, the code is under review by the mentors.
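
The wiring can be sketched as below; the constraint plugin id 'EmotionDetection' is an assumption:

  /**
   * Implements hook_entity_bundle_field_info_alter().
   */
  function google_vision_entity_bundle_field_info_alter(&$fields, \Drupal\Core\Entity\EntityTypeInterface $entity_type, $bundle) {
    foreach ($fields as $field) {
      // Only image fields should trigger the emotion check on upload.
      if ($field->getType() == 'image') {
        $field->addConstraint('EmotionDetection');
      }
    }
  }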


In addition to the implementation of Face Detection, I also worked on expanding the tests of the Safe Search Detection feature of the Google Vision API module to cover other entities besides nodes. I have expanded the tests to check the safe search constraint on the comment entity as well.
This requires the creation of a dummy comment type, adding an image field to the comment type, and attaching the comment type to a content type; a sketch of the setup follows below.
The image field carries the safe search constraint, so the test is basically similar to the tests already present in the module for the node entity.
The code is under review by the mentors and will soon be committed to the module.
For reference on how to create dummy comment types and attach them to content types, the CommentTestBase class is very helpful.
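
Here is a sketch of that setup, assuming the test class extends the module's own test base and uses the comment module's CommentTestTrait; the field names are illustrative:

  <?php

  namespace Drupal\google_vision\Tests;

  use Drupal\comment\Tests\CommentTestTrait;
  use Drupal\field\Entity\FieldConfig;
  use Drupal\field\Entity\FieldStorageConfig;

  /**
   * Tests the safe search constraint on image fields of comments.
   */
  class SafeSearchCommentTest extends GoogleVisionTestBase {

    use CommentTestTrait;

    public static $modules = ['google_vision', 'node', 'comment', 'image'];

    protected function setUp() {
      parent::setUp();
      // Create a content type and attach the default comment field to it.
      $this->drupalCreateContentType(['type' => 'article']);
      $this->addDefaultCommentField('node', 'article');

      // Add an image field to the comment type; the safe search
      // constraint rides on image fields, so it now guards comments too.
      FieldStorageConfig::create([
        'field_name' => 'field_comment_image',
        'entity_type' => 'comment',
        'type' => 'image',
      ])->save();
      FieldConfig::create([
        'field_name' => 'field_comment_image',
        'entity_type' => 'comment',
        'bundle' => 'comment',
      ])->save();
    }

  }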

Wednesday 29 June 2016

GSOC 2016- Display all the contents sharing same Dominant Colors in their images as of the current content- Week 5

TL;DR The safe search constraint feature is now committed to the module along with proper web tests. So, this week I started off with a new feature offered by the Google Cloud Vision API: Image Properties Detection. It detects various properties and attributes of an image, namely the RGB components, pixel fraction and score. I have worked on detecting the dominant color component in the image present in a content item, and displaying all the contents sharing a similar dominant color. It is pretty much like what we see on e-commerce sites.


The previous week I worked on writing web tests for the safe search constraint validation on image fields. This feature is now committed to the Google Vision API module.


This week I have worked on implementing another feature provided by the Google Cloud Vision API, namely Image Properties Detection. This feature detects the components of red, green and blue colors in an image, along with their pixel fractions and scores.
I have used it to determine the dominant color component (i.e. red, green or blue) in the image, and to display all those contents which have the same dominant color in their images.


I have developed code which creates a new route, /node/{nid}/relatedcontent, to display the related contents in the form of a list. This makes use of the Controllers and Routing System of Drupal 8: a controller class is extended to render the output of the page in the format we require. The contents are displayed as a list of links to their respective nodes, named by their titles.
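
In sketch form, the route and controller pair looks like this; the route name, controller class and permission are assumptions:

  # In google_vision.routing.yml:
  google_vision.related_content:
    path: '/node/{nid}/relatedcontent'
    defaults:
      _controller: '\Drupal\google_vision\Controller\RelatedContentController::build'
      _title: 'Related Content'
    requirements:
      _permission: 'access content'

  <?php

  namespace Drupal\google_vision\Controller;

  use Drupal\Core\Controller\ControllerBase;
  use Drupal\Core\Url;

  /**
   * Renders the list of nodes sharing the current node's dominant color.
   */
  class RelatedContentController extends ControllerBase {

    /**
     * Builds the /node/{nid}/relatedcontent page.
     */
    public function build($nid) {
      // In the real module, the node ids and titles come from a query over
      // the Dominant Colors terms; a placeholder keeps the sketch short.
      $related = [];
      $build = [];
      foreach ($related as $related_nid => $title) {
        // Each entry links to its node and is named by the node's title.
        $build[] = [
          '#type' => 'link',
          '#title' => $title,
          '#url' => Url::fromRoute('entity.node.canonical', ['node' => $related_nid]),
        ];
      }
      return $build;
    }

  }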


In addition to the grouping of similar contents, the colors are also stored as taxonomy terms under a taxonomy vocabulary programmatically generated under the name Dominant Colors.


This issue is still in progress and requires a little modification. I need to add a link to the new route in each of the nodes, so as to provide a better interface to access those contents. After that, I will put this patch up for review.

A very simple example of creating routes and controllers in your module can be found here.