Apache Commons Bridge For Scala

Motivation and preparation

Apache Commons is a large and rich collection of utility libraries. I first came across it a few years ago while working on a Java project, and it is now an essential tool for my Java and Scala work.

At the moment I mainly use commons-lang3, which contains a number of extensions to the Java standard library. For me, StringUtils is particularly useful.

StringUtils is a huge class with more than 200 static methods covering a large number of common string operations, such as the blank and empty checks. Most of these utility functions are not complex, but writing them safely and reliably takes considerable effort. Null handling in particular requires careful coverage of every boundary condition. Almost all of the methods in Apache Commons' StringUtils are null-safe. Some degrade safely: a function that constructs a new string returns null when it is given null, and a function that receives null as an operation parameter returns the original data unchanged. Others apply a reasonable default and also offer fine-grained control: comparison, for example, has a default implementation plus a version whose ordering of null is controlled by the nullIsLess parameter.
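
As a rough illustration of the behaviour described above, the sketch below calls a few StringUtils methods with null. It assumes commons-lang3 is on the classpath; the object name NullSafetyDemo and the val names are invented for this example.

```scala
import org.apache.commons.lang3.StringUtils

object NullSafetyDemo {
  // A typed null keeps overload resolution unambiguous.
  val nothing: String = null

  // The checks treat null like an empty or blank string.
  val blank: Boolean = StringUtils.isBlank(nothing) // true
  val empty: Boolean = StringUtils.isEmpty(nothing) // true

  // A method that builds a new string passes null through.
  val shortened: String = StringUtils.abbreviate(nothing, 4) // null

  // Searching inside null simply finds nothing.
  val found: Boolean = StringUtils.contains(nothing, "a") // false

  // nullIsLess decides on which side of a non-null value null is ordered.
  val nullFirst: Int = StringUtils.compare(nothing, "a", true)  // negative
  val nullLast: Int = StringUtils.compare(nothing, "a", false)  // positive
}
```

None of these calls throws, which is exactly the property a Scala wrapper has to preserve.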

The library is very pragmatic about Java support. The lang3 line supports every Java version from 1.8 upward, and older branches support earlier versions. Even though String in newer Java versions already comes with isBlank and isEmpty, the Apache Commons versions remain the safer and more practical choice. Utility functions that handle null, span a wide range of Java versions, and follow consistent and complete rules are of great value to application developers.

Although my recent work has focused on Scala, I still like to take full advantage of Apache Commons. This is especially true for Flink jobs running on cloud platforms, which are tied to older Java and Scala versions and leave me no choice: the project I am working on right now has to run on Java 1.8 and Scala 2.11. In that setting StringUtils is even more valuable. Of course, Apache Commons is much more than StringUtils, but StringUtils, Text, Math, and IO are the parts most useful in my daily work.

So I had the idea of writing a more Scala-friendly wrapper to make these tools easier and smoother to use from Scala.
Initially, I simply put an implicit class in my project code and used it to wrap a few commonly used functions.

import org.apache.commons.lang3.StringUtils

object Strings {
  implicit class StringOpt(x: String) {
    def isBlank: Boolean = StringUtils.isBlank(x)

    def isNotBlank: Boolean = StringUtils.isNotBlank(x)

    def isNoneBlank: Boolean = StringUtils.isNoneBlank(x)

    def isEmpty: Boolean = StringUtils.isEmpty(x)

    def isNotEmpty: Boolean = StringUtils.isNotEmpty(x)

    def isNoneEmpty: Boolean = StringUtils.isNoneEmpty(x)
  }
}

This is enough for my current project, where I can call these utility functions as if they were member methods of String:

// Yes, I know: isBlank is now in the standard library. But this project is still on Java 8.
      refer = if (input("f0").isBlank) {
        None
      } else {
        Some(input("f0"))
      },

But as a general-purpose library, it is too crude.

  • Because the StringUtils methods are static, the data itself is allowed to be null. An implicit class declared on String does not automatically keep that guarantee. StringUtils.isBlank(null) returns true, but a call written as a method on a null value can end in a NullPointerException instead, for instance whenever the name resolves to a method that String already has.
  • Most importantly, there are over two hundred functions, and I have wrapped only the few I currently use.
  • For project code I do not need to consider Scala version compatibility, but a general-purpose library should work in all my Scala projects: from Scala 2.11 (used by Flink on Alibaba Cloud) through 2.12 (used by Figaro) and 2.13 (the highest version Play supports so far) to Scala 3 (used by Jaskell Dotty). My personal resources are limited, so the less porting work this takes, the better.
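
The null-safety problem in the first point can be reproduced without Apache Commons. In the sketch below, safeIsEmpty is a hypothetical stand-in for a null-safe static helper, and isEmptySafe is an invented name that String does not define. It is only meant to show one way the guarantee gets lost: when String already has a member with the same name, the compiler uses that member and never looks at the implicit class.

```scala
import scala.util.Try

object ShadowDemo {
  // Hypothetical stand-in for a null-safe static helper such as StringUtils.isEmpty.
  def safeIsEmpty(s: String): Boolean = s == null || s.length == 0

  implicit class StringOpt(x: String) {
    // Not used for a String receiver: String already has an isEmpty member.
    def isEmpty: Boolean = safeIsEmpty(x)

    // String has no member with this name, so the implicit class applies.
    def isEmptySafe: Boolean = safeIsEmpty(x)
  }

  val nullStr: String = null

  // Goes through the implicit class and the null-safe helper.
  val viaExtension: Boolean = nullStr.isEmptySafe // true

  // Calls java.lang.String#isEmpty on null and fails with a NullPointerException.
  val viaMember: Try[Boolean] = Try(nullStr.isEmpty)
}
```

The extension with a fresh name keeps the null-safe behaviour, while the one that collides with a String member is silently bypassed.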

I started writing this wrapper library over the weekend. I thought I would get the skeleton up quickly and then move on to a mechanical porting process. Instead, I spent two torturous days just implementing the basic structure.

Thanks to enthusiastic guidance from peers in the Scala community, the current functionality goes far beyond my expectations, and far beyond my own abilities: the core was written entirely by the people who guided me.

Only a few functions are implemented so far, but from here on I really only need to wrap the rest mechanically and write documentation comments for the wrapped versions based on the originals. To me these comments are no less valuable than the code itself. None of this work leaves room for cleverness; it is simply a matter of implementing one function after another.

So far the wrapper provides:

  • Type safety. The code base is built on Option[String] and takes full advantage of Scala's own functional programming features.
  • Implicit conversion of String values, including String variables holding null, to Option[String]. In most cases, string variables that may be empty because they come from JSON, databases, or bizarre data pipelines can be used directly. This relies on some advanced Scala type techniques; thanks again to my community peers for their guidance!
  • High consistency. Apart from the implicit rules, which Scala 2 and Scala 3 each implement in their own way, all code is shared. In fact, after a series of compatibility changes, the Scala 2 and Scala 3 versions differ only in the name of one function call (implicitly versus summon, which injects the type conversion rule); everything else is almost identical. Initially I wanted the Scala 3 version to use the new given/extension syntax instead of the traditional implicit class/def/var. While coding, however, I found that extensions cannot define member variables and that given instances were not recognized when importing the implicits, so in the end the key functions use almost the same code. This difference may even disappear entirely in the future, merging the Scala 2 and Scala 3 versions into one fully consistent implementation.

Key Functions

While I was discussing the design of this library with my peers, a friend made a crucial suggestion: implement the main functionality on top of Option[String]. That raised a problem when it came to providing extensions for Option, because the type is sealed and cannot be extended outside its source file. My friend's suggestion was to define a new type as a wrapper:

trait ToStringOpt[-T] extends TypeMapping[T, Option[String]] {
  override def apply(i: T): Option[String]
}

object ToStringOpt {
  def apply[S](func: S => Option[String]): ToStringOpt[S] = (i: S) => func(i)

}

This type relies only on another very small definition, TypeMapping, which describes a conversion from one type to another:

trait TypeMapping[-I, +O] {
  def apply(i: I): O
}

object TypeMapping {
  def apply[S, T](func: S => T): TypeMapping[S, T] = func(_)
}

Two implicit instances of ToStringOpt then cover String and Option[String]:

    implicit val stringMappingImplicit: ToStringOpt[String] = ToStringOpt(i => Option(i))
    implicit val stringOptMappingImplicit: ToStringOpt[Option[String]] = ToStringOpt(identity)

These two lines live in commons.lang3.scala.StringUtils. The only difference between Scala 2 and Scala 3 is how the rule is injected inside the implicit class. Here is the Scala 2 version:

object StringUtils {

  object bridge {
    import commons.lang3.scala.ToStringOpt

    implicit val stringMappingImplicit: ToStringOpt[String] = ToStringOpt(i => Option(i))
    implicit val stringOptMappingImplicit: ToStringOpt[Option[String]] = ToStringOpt(identity)

    implicit class StringOptExt[T: ToStringOpt](x: T) {
      // Scala 2 uses implicitly
      private def optFunc: ToStringOpt[T] = implicitly

      def strOpt: Option[String] = optFunc(x)

      val ops = new StringCommons[T](x)
    }

  }
}

This is the Scala 3 version:

object StringUtils {

  object bridge {

    import commons.lang3.scala.ToStringOpt

    implicit val stringMappingImplicit: ToStringOpt[String] = ToStringOpt(i => Option.apply[String](i))
    implicit val stringOptMappingImplicit: ToStringOpt[Option[String]] = ToStringOpt(identity)

    import org.apache.commons.lang3.{StringUtils => Strings}

    implicit class StringOptExt[T: ToStringOpt](x: T) {
      // Scala 3 uses summon
      private def optFunc: ToStringOpt[T] = summon

      def strOpt: Option[String] = optFunc(x)

      val ops: StringCommons[T] = new StringCommons(x)
    }

  }
}
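
To make the instance selection a little more concrete, here is a minimal self-contained sketch. It is not the library code: it drops the TypeMapping parent, builds the instances with anonymous classes so that it also compiles on older Scala 2 versions, and introduces a hypothetical helper toOpt that does what strOpt does, as a plain function. The names ToStringOptSketch, stringMapping and optionMapping are likewise made up for the example.

```scala
object ToStringOptSketch {
  // Same shape as the trait in this article, without the TypeMapping parent.
  trait ToStringOpt[-T] {
    def apply(i: T): Option[String]
  }

  object ToStringOpt {
    def apply[S](func: S => Option[String]): ToStringOpt[S] =
      new ToStringOpt[S] {
        def apply(i: S): Option[String] = func(i)
      }
  }

  // Option(null) is None, so a null String becomes None here.
  implicit val stringMapping: ToStringOpt[String] =
    ToStringOpt[String](i => Option(i))
  implicit val optionMapping: ToStringOpt[Option[String]] =
    ToStringOpt[Option[String]](o => o)

  // Hypothetical helper: picks the instance for T and applies it.
  def toOpt[T](x: T)(implicit mapping: ToStringOpt[T]): Option[String] = mapping(x)

  val nullStr: String = null

  val fromString: Option[String] = toOpt("abc")     // Some("abc")
  val fromNull: Option[String] = toOpt(nullStr)     // None
  // Contravariance lets the Option[String] instance serve Some[String] and None.
  val fromSome: Option[String] = toOpt(Some("abc")) // Some("abc")
  val fromNone: Option[String] = toOpt(None)        // None
}
```

Because ToStringOpt is contravariant in T, the instance for Option[String] is also accepted where an instance for Some[String] or None.type is required, which is why two instances are enough.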

The extension methods are not implemented directly in the implicit class because, in practice, some of them have the same name as methods the original type already has. contains is the clearest case: when it is defined directly on the implicit class, the compiler finds String's own contains method first and reports a type error (at compile time, not at run time). Other names resolve first to members of the StringOps type built into Scala. So I chose a less sophisticated but less demanding and more intuitive approach: put every implementation in a single separate type and expose it as a member variable of the implicit class StringOptExt, which itself stays this small:

    implicit class StringOptExt[T: ToStringOpt](x: T) {

      private def optFunc: ToStringOpt[T] = summon

      def strOpt: Option[String] = optFunc(x)

      val ops: StringCommons[T] = new StringCommons(x)
    }

That separate type is StringCommons.

import org.apache.commons.lang3.{StringUtils => Strings}

class StringCommons[T: ToStringOpt](value: T) {
  // Brings strOpt into scope for any value that has a ToStringOpt instance
  import commons.lang3.scala.StringUtils.bridge.StringOptExt

  // orNull turns None back into null, which the static methods handle themselves
  def contains[To: ToStringOpt](seq: To): Boolean = Strings.contains(value.strOpt.orNull, seq.strOpt.orNull)

  def contains(searchChar: Char): Boolean = Strings.contains(value.strOpt.orNull, searchChar)

  def abbreviate(maxWidth: Int): String = Strings.abbreviate(value.strOpt.orNull, maxWidth)

  def abbreviate(offset: Int, maxWidth: Int): String = Strings.abbreviate(value.strOpt.orNull, offset, maxWidth)

  // Many more methods follow the same pattern; they are not listed one by one.
}
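
The ops pattern can be tried out in isolation. The sketch below is self-contained and only approximates the library: safeContains is a hypothetical stand-in for the null-safe static method, and OpsSketch, Commons and Ext are shortened names invented for the example.

```scala
object OpsSketch {
  trait ToStringOpt[-T] {
    def apply(i: T): Option[String]
  }

  implicit val stringMapping: ToStringOpt[String] =
    new ToStringOpt[String] { def apply(i: String): Option[String] = Option(i) }
  implicit val optionMapping: ToStringOpt[Option[String]] =
    new ToStringOpt[Option[String]] { def apply(i: Option[String]): Option[String] = i }

  // Hypothetical stand-in for a null-safe static method: false if either side is null.
  def safeContains(s: String, sub: String): Boolean =
    s != null && sub != null && s.contains(sub)

  // All wrapped methods live here, so their names never compete with String's members.
  class Commons[T](value: T)(implicit mapping: ToStringOpt[T]) {
    def contains[U](sub: U)(implicit subMapping: ToStringOpt[U]): Boolean =
      safeContains(mapping(value).orNull, subMapping(sub).orNull)
  }

  // The implicit class adds a single new name, ops.
  implicit class Ext[T: ToStringOpt](x: T) {
    val ops: Commons[T] = new Commons(x)
  }

  val nullStr: String = null

  val plain: Boolean = "abc".ops.contains("b")                  // true
  val nullReceiver: Boolean = nullStr.ops.contains("b")         // false
  val nullArgument: Boolean = "abc".ops.contains(nullStr)       // false
  val optionReceiver: Boolean = Some("abc").ops.contains("c")   // true
  val optionArgument: Boolean = "abc".ops.contains(Option("z")) // false
}
```

The price is the extra .ops at every call site. In exchange, a method name on the wrapper can never be captured by a member of String or StringOps.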

I have not published this project yet. My plan is to port at least all of the StringUtils methods, complete the documentation and tests, and then publish the first version. I may also publish an unofficial, incomplete version first, depending on what my work requires.
Because this is a thin wrapper, we can still use the original Commons library directly for features it does not cover or where it is inconvenient. For example, the wrapper handles String variables whose value is null:

  import commons.lang3.scala.StringUtils.bridge._

  "Strings" should "test contains operators" in  {
    // These assertions pass
    val nullStr:String = null
    nullStr.ops.contains(' ') should be (false)
    "".ops.contains(' ') should be (false)

    "".ops.contains(nullStr) should be (false)
    nullStr.ops.contains(nullStr) should be (false)

    "abc".ops.contains('a') should be (true)
    "abc".ops.contains('b') should be (true)
    "abc".ops.contains('c') should be (true)

However, the null literal itself is not supported:

// Add the following import line to your code to use all of its functionality
  import commons.lang3.scala.StringUtils.bridge._

  "Strings" should "test contains operators" in  {
    // Without doubt, the next line fails: a bare null literal is not supported
    null.ops.contains(' ') should be (false)
    "".ops.contains(' ') should be (false)

    "".ops.contains(nullStr) should be (false)
    nullStr.ops.contains(nullStr) should be (false)
//...

However, we hardly ever use null directly outside of tests, especially since Scala already supports the Option type very well. If you do run into this scenario, use the original static methods, which already handle these cases very carefully.
Using the library is very simple. Because it is not published yet, there is no dependency configuration to give; in code, a single import line is all that is needed, as in the previous examples.
Finally, here is the complete test code showing how the bridge library wraps the two contains methods:

package commons.lang3.scala

import org.scalatest.flatspec.AnyFlatSpec
import org.scalatest.matchers.should.Matchers

class StringUtilsSpec extends AnyFlatSpec with Matchers{

  import commons.lang3.scala.StringUtils.bridge._

  "Strings" should "test contains operators" in  {

    val nullStr:String = null
    nullStr.ops.contains(' ') should be (false)
    "".ops.contains(' ') should be (false)

    "".ops.contains(nullStr) should be (false)
    nullStr.ops.contains(nullStr) should be (false)

    "abc".ops.contains('a') should be (true)
    "abc".ops.contains('b') should be (true)
    "abc".ops.contains('c') should be (true)
    "abc".ops.contains('z') should be (false)
  }

  "Options" should "test contains operators for option string" in {
    val noneStr: Option[String] = None
    noneStr.ops.contains(' ') should be(false)
    Some("").ops.contains(' ') should be(false)

    Some("").ops.contains(None) should be(false)

    None.ops.contains(None) should be(false)

    Some("abc").ops.contains('a') should be(true)
    Some("abc").ops.contains('b') should be(true)
    Some("abc").ops.contains('c') should be(true)
    Some("abc").ops.contains('z') should be(false)

  }
}

Tags: Java Scala Apache csdn

Posted by KindredHyperion on Mon, 29 Aug 2022 02:59:05 +0300