Friday 13 March 2015

OAuth 2.0

 Most of you may have used Facebook for logging into some websites for example, www.gaana.com. As user is authenticated by FB, other websites don’t think of authenticating the user again. Instead, gets user’s information from FB. A consent screen is shown to the user, where he can accept/reject the permissions asked by the third party(www.gaana.com) Application using say FB. These permissions are for accessing the information from FB in read only, write only or both modes. This set of permissions depend upon the third party Application requirements.
If you are a new to the terminologies like OAuth or OpenId, then many questions may be going on in your mind. What is OAuth? Why OAuth?
What is OAuth?
The OAuth 2.0 is an authorization protocol that enables the third party application to obtain access to the resources on resource owner’s behalf. The resources can be anything like services, files or some information.
Why OAuth?Consider a scenario where third-party applications are required to store the resource owner's credentials for future in its database. As users have to store their credentials, Third-party applications gain overly broad access to the resource owner's protected resources, leaving resource owners without any ability to restrict duration or access to a limited subset of resources. So, instead of storing these credentials, can we have something that can be used for giving limited access to the resources of the users, so that no one can touch the information that user never intends to share?
And the answer is Yes. OAuth is the one of the best protocol for this purpose.
OAuth 2.0 Grant Types
  • Authorization Code: Auth Code grant type is used to obtain the Access and Refresh Tokens. Refer below figure and description for more information.
  • Implicit Grant: It does not support issuance of Refresh Tokens.
  • Resource Owner Password Credentials: It is suitable in some scenarios where resource owner has a trust relationship with the third party.
  • Authorization Request and Response: The client authentication is used as the authorization grant, no additional authorization request is needed.
  • let’s see in depth about the messages that are shared between a client and the different authentication/authorization servers for Authorization code grant type.
    In OAuth, the client requests access to resources controlled by the resource owner and hosted by the resource server, and is issued a different set of credentials than those of the resource owner. Instead of using resource owner’s credentials , third party makes use of Access Tokens for accessing the protected resources. This Access Token is nothing but a string denoting a specific scope, lifetime and other access attributes. Access Tokens are issued to third party clients by an authorization server with approval of resource owner. Now, lets see how this protocol works in details and consider the following block diagram.
     
    OAuthJPG6
    Fig 1: OAuth 2.0 Flow
    The flow illustrated in the diagram includes the following the steps:
     
    The Authorization and Authentication process of OAuth consists of exchange of two messages between user and the server.
    (A) User authenticates itself by sending the Client Id to the Authorization Server and after successful authentication gets the Auth Code or Authorization Grant. This Auth code is expires within some minutes say 5 mins. This expiry time depends upon the authorization server.
     
    (B) The Authorization Grant obtained in the first step along with Client Secret is used in order to obtain Access and Refresh Tokens. Out of these two tokens, Access token has validity of one hour(again expiry time is server dependent) and Refresh token has unlimited validity. For example, in case of Google Analytics, client can generate infinite no. of Refresh Tokens but when 21st token is generated 1st expires and similarly, for 22nd, 2nd expires and so on. Refresh token is used in order to obtain new access token, if expired or client wants to generated the newer one.
     
    (C) Client makes request for accessing the protected resource to the resource server by presenting the Access token.
    (D) The resource server validates the request and if found to be valid, serves the request.
    (E) The steps (C) & (D) repeat until the Access token expires.
    (F) Resource server serves the requests for the protected resource till the access token remains valid. If token found out to be invalid, it sends the error message.
    (G) If Access token expires or client knows that his/her token is going to expire, he/she can request for the newer Access token by presenting refresh token.
    (H) In this step, client gets the requested access token after authorization server authenticates the client and validates the refresh token. In the same response, he can get the new Refresh token also, if requested. As mentioned, client can generate infinite no. of refresh tokens but only last ‘N’ tokens will be valid and the value of this ‘N’ depends upon the authorization server.
    I am sure most of you have some doubts:  
  • What is the need of two tokens i.e. Access and Refresh?
  • Why two step process?
  • Need of Two Tokens
    One of the reason of increasing popularity of OAuth 2.0 over OpenId is two tokens. In case of OpenId, each user can access the protected resources by having login with OpenId credentials. If some interceptor gets these credentials, then it can access the protected resources. To overcome this problem, OAuth 2.0 has two tokens. Most of the resource requests contain only access token, so if some interceptor gets this token, it can use for maximum of one hour(say) and after that access token gets expired. So, for providing limited time access to the resources, access token is introduced and for generating the new access token, refresh token can be utilized. Hence, OAuth 2.0 is more secure than OpenId.
    Need of Two Steps
    Consider a scenario where clients are entirely built on JavaScript and no Java or server side code and running in resource owner’s browser. Now, any application developer must not want his client Secret along with the tokens to be visible to the user. But in this scenario, its not possible to hide these secrets from the resource owner. So, only the first step known as Implicit Grant is used in this scenario and second one is ditched. So, only to achieve simplicity two steps are needed.
    I think this post may be helpful for the developers working on third party authorization/authentication.
    Please, do write the comments of appreciations or doubts.
    Stay tuned for the next post till then eNjoy coding. Smile with tongue out Surprised smile Winking smile
    For more information:
  • Monday 26 January 2015

    External Sort

     

    I guess most of the people have implemented sorting algorithms like merge sort, quick sort, etc but they may not be aware of External Sort. So, What does this external sort mean?

    In order to understand this concept, we will introduce a scenario. Suppose, we have to sort a file of size far larger than that of physical memory (RAM). To tackle this problem, firstly, increase the physical memory or secondly, try to introduce some other algorithm. Yes, second option is the better one but how?

    This problem can be solved with this external merge sort algorithm. Let’s have a look of this algorithm.

    This algorithm consists of two phases:

    1) Creation of sorted sublists

    2) Merging sublists to form a single file

    Let’s start developing the foundation of the solution. Let the size of the main memory be ‘m’ buffers and that of file be ‘n’ such that n >> m. We can sort the file that has size less than or equal to the size of physical memory by making use of any known sorting algorithm. Here, sorting algorithms like merge sort cannot be used because such kind of sorting algorithm requires more memory for storing intermediate results.

    Phase I:

    Get the first m buffers (size of RAM) of the file into the main memory and sort using any internal sorting algorithm. Now, after sorting dump this onto the disk in a file called as sublist. Repeat this process until the whole file gets sorted into sublists. Obviously, we will get n/m no. of sublists. All the sublists will be nothing but sorted files of size equal to that of main memory.

    Phase II:

    So, we will get n/m = t (say) no. of sorted sublists.

    Now, we have to merge these sorted sublists in order to create a single sorted file. One thing should be noted here, these sublists are not mutually sorted, don’t worry; you will understand the whole concept from the example.

    Divide the main memory (logically) into (t+1) buffers as we have t sorted sublists and one extra buffer for storing the results of the merge operation. Now, the size of the buffer (s) will be equal to m/ (t+1). Read the s bytes from each sublists into the main memory and compare the top values from each buffer and put the lesser value into the output buffer and remove that value from its respective buffer. Fill the buffer when it becomes empty from its respective sublists. As soon as the output buffer gets full, dump it into an output file and this file will be the final sorted file that we wanted. All the dumps will be performed on the same output file in phase II.

    For example,

    Suppose we have a file containing the following nos.

    1, 2, 3, 5, 4, 3, 5, 1, 3, 4, 5, 8, 3, 5, 1, 8, 3, 7, 4, 5 and we can store max. of 5 nos. at any instant of time in the main memory i.e. size of the main memory = 5 nos.

    Now, we have to sort these nos. using External Sort mechanism.

    Phase I:

    Take first 5 nos. and put them in main memory, sort them and put them in a sublist.

    Sublist 1: 1 2 3 4 5

    Similarly, other sublists will be:

    Sublist 2: 1 3 3 4 5

    Sublist 3: 1 3 3 5 8

    Sublist 4: 3 4 5 7 8

    Phase II:

    Now, we have 4 sublists, so 4 input plus 1 output buffers.

    The working of this phase can be understood very easily from the above diagrams.

    Step 1:

    clip_image002_thumb5

    Initially, one no. from each sublist is loaded into physical memory (as each buffer can store only one no.) and the smallest no. is stored into the output buffer. This buffer is emptied/dumped into a file (output file).

    Similarly,

    Step 2:

    clip_image004_thumb6

    Step 3:

     clip_image006_thumb5

    Step 4:

    clip_image008_thumb5

    Step 5:

    clip_image010_thumb5

    In the similar fashion, we can go ahead until all the sublists become empty and at last the output file will contain sorted nos.

    I hope, you can sort the files of huge sizes using external sorting algorithm that we have seen.

    Please don’t forget to write the comments of appreciation or doubts.