Wednesday 24 May 2023

ConcurrentModificationException and Streams - what not to do

I was wondering what shenanigans I could get up to by mutating a collection while streaming over it, and just how terrible an idea that is.

As we all know, it's a really bad idea to mutate a list on which you are iterating, and Java's standard collections will usually fail fast with a ConcurrentModificationException when you try.

It all starts with the original problem, a for-loop that also mutates the list on which it iterates:

  @Test(expectedExceptions = ConcurrentModificationException.class)
  public void simpleTest() {
    List<Long> longs = new ArrayList<>(List.of(12L, 13L, 16L, 20L, 55L, -5L, 12L, 5L, 100L, 1000L));
    for (Long aLong : longs) {
      if (aLong > 50L || aLong < 0L) {
        longs.remove(aLong);
      }
    }
    assertThat(longs).isEqualTo(List.of(12L, 13L, 16L, 20L, 12L, 5L));
  }
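
For the plain loop, the long-standing fix is to drive the Iterator by hand and remove through it, since Iterator.remove() keeps the iterator's own bookkeeping in sync with the list. A minimal sketch of that idea (the class and method names are made up for illustration):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class IteratorRemoveSketch {

  // Copies the input, then removes out-of-range values through the
  // iterator itself, so the iterator never sees an unexpected change.
  static List<Long> removeOutOfRange(List<Long> input) {
    List<Long> longs = new ArrayList<>(input);
    for (Iterator<Long> it = longs.iterator(); it.hasNext(); ) {
      Long aLong = it.next();
      if (aLong > 50L || aLong < 0L) {
        it.remove(); // removes the element last returned by next()
      }
    }
    return longs;
  }

  public static void main(String[] args) {
    // prints [12, 13, 16, 20, 12, 5]
    System.out.println(removeOutOfRange(
        List.of(12L, 13L, 16L, 20L, 55L, -5L, 12L, 5L, 100L, 1000L)));
  }
}
```

It is more ceremony than the enhanced for-loop, but it is the one form of removal during iteration that the Iterator contract explicitly supports.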

Nowadays, we can use the forEach method on a collection, turning it into something like this with exactly the same problem:

  @Test(expectedExceptions = ConcurrentModificationException.class)
  public void forEachTest() {
    List<Long> longs = new ArrayList<>(List.of(12L, 13L, 16L, 20L, 55L, -5L, 12L, 5L, 100L, 1000L));
    longs.forEach(t -> {
      if (t > 50L || t < 0L) {
        longs.remove(t);
      }
    });
    assertThat(longs).isEqualTo(List.of(12L, 13L, 16L, 20L, 12L, 5L));
  }

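If you are using a stream, the same problem occurs, as the stream reads lazily from the underlying collection. The filter(Objects::nonNull) is there because, at least on OpenJDK, the stream can run into the null slots that each removal leaves at the end of the backing array; without it, the predicate would likely throw a NullPointerException first: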

  @Test(expectedExceptions = ConcurrentModificationException.class)
  public void streamTest() {
    List<Long> longs = new ArrayList<>(List.of(12L, 13L, 16L, 20L, 55L, -5L, 12L, 5L, 100L, 1000L));
    longs.stream()
        .filter(Objects::nonNull)
        .filter(t -> t > 50L || t < 0L)
        .forEach(longs::remove);
    assertThat(longs).isEqualTo(List.of(12L, 13L, 16L, 20L, 12L, 5L));
  }
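
It is worth knowing that the exception is a best-effort check rather than a guarantee. The sketch below (class and method names are made up, and the behaviour shown is that of OpenJDK's ArrayList iterator) removes the second-to-last element during a for-each loop. No exception is thrown, and the last element is silently skipped:

```java
import java.util.ArrayList;
import java.util.List;

class SilentSkipSketch {

  // Removes the value 2 while iterating and records what the loop visited.
  // After the removal the iterator's cursor equals the new size, so
  // hasNext() returns false and the loop ends before next() can notice
  // that the list was modified.
  static List<Long> visitedWhileRemoving(List<Long> longs) {
    List<Long> visited = new ArrayList<>();
    for (Long aLong : longs) {
      visited.add(aLong);
      if (aLong == 2L) {
        longs.remove(aLong);
      }
    }
    return visited;
  }

  public static void main(String[] args) {
    List<Long> longs = new ArrayList<>(List.of(1L, 2L, 3L));
    System.out.println(visitedWhileRemoving(longs)); // prints [1, 2]
    System.out.println(longs);                       // prints [1, 3]
  }
}
```

So a test that happens to pass without a ConcurrentModificationException does not prove the loop is correct.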

One way to fix this, though admittedly not a great one, is to collect the elements to be removed into a separate list first, like so:

  @Test
  public void streamWithToListTest() {
    List<Long> longs = new ArrayList<>(List.of(12L, 13L, 16L, 20L, 55L, -5L, 12L, 5L, 100L, 1000L));
    longs.stream()
        .filter(t -> t > 50L || t < 0L)
        .collect(Collectors.toList())
        .forEach(longs::remove);
    assertThat(longs).isEqualTo(List.of(12L, 13L, 16L, 20L, 12L, 5L));
  }

But my colleague at work mentioned the removeIf method (on Collection since Java 8), which works particularly well:

  @Test
  public void removeIfTest() {
    List<Long> longs = new ArrayList<>(List.of(12L, 13L, 16L, 20L, 55L, -5L, 12L, 5L, 100L, 1000L));
    longs.removeIf(t -> t > 50L || t < 0L);
    assertThat(longs).isEqualTo(List.of(12L, 13L, 16L, 20L, 12L, 5L));
  }

Of course, the whole thing is patently ridiculous, because you might as well just filter unwanted elements out of the stream and create a new list from it.
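
For the record, that filter-and-collect version could look something like this (a sketch with made-up class and method names; the predicate is simply the negation of the removal condition used in the tests above):

```java
import java.util.List;
import java.util.stream.Collectors;

class FilterSketch {

  // Keeps the wanted elements instead of removing the unwanted ones.
  // The input list is only read, never modified.
  static List<Long> keepInRange(List<Long> longs) {
    return longs.stream()
        .filter(t -> t >= 0L && t <= 50L)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // prints [12, 13, 16, 20, 12, 5]
    System.out.println(keepInRange(
        List.of(12L, 13L, 16L, 20L, 55L, -5L, 12L, 5L, 100L, 1000L)));
  }
}
```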

But in some cases the "remove" is not as simple as chucking an element out of a collection (for example something as involved as, oh, I don't know, ImportantRestService.removeItem(Item item)). Then one has to find other ways to deal with it, which is where the collect-into-a-list-first approach (streamWithToListTest) might come in handy.
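
To make that concrete, here is a rough sketch of the snapshot-first idea applied to a service call. Everything in it is an assumption for illustration: the Item class, the shape of the ImportantRestService interface, and the in-memory fake that stands in for the real service.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class ServiceRemovalSketch {

  // Hypothetical item type; the real Item is not shown in this post.
  static final class Item {
    final long value;

    Item(long value) {
      this.value = value;
    }
  }

  // Hypothetical stand-in for ImportantRestService.
  interface ImportantRestService {
    void removeItem(Item item);
  }

  // Snapshots the doomed items first, then calls the service once per item.
  // The service may mutate `items` freely, because by then we iterate over
  // the snapshot rather than over `items` itself.
  static void purge(List<Item> items, ImportantRestService service) {
    List<Item> toRemove = items.stream()
        .filter(item -> item.value > 50L || item.value < 0L)
        .collect(Collectors.toList());
    toRemove.forEach(service::removeItem);
  }

  // Demo with an in-memory fake service whose removeItem drops the item
  // from the very list we are purging. Returns the surviving values.
  static List<Long> demo(long... values) {
    List<Item> items = new ArrayList<>();
    for (long v : values) {
      items.add(new Item(v));
    }
    purge(items, items::remove);
    return items.stream().map(item -> item.value).collect(Collectors.toList());
  }

  public static void main(String[] args) {
    System.out.println(demo(12L, 55L, -5L, 5L)); // prints [12, 5]
  }
}
```

The cost is one extra list holding the items to remove, which seems a fair price when the removal itself is a remote call.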

I don't know. I'm open to suggestions.

Saturday 13 May 2023

Mounting and umounting my MDADM disk array

When I connect my USB hard drives, dmesg shows the following messages:

[ 3455.138687] usb 1-1.5: new high-speed USB device number 3 using ehci-pci
[ 3455.218450] usb 1-1.5: New USB device found, idVendor=1058, idProduct=1021, bcdDevice=20.21
[ 3455.218462] usb 1-1.5: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 3455.218468] usb 1-1.5: Product: Ext HDD 1021
[ 3455.218472] usb 1-1.5: Manufacturer: Western Digital
[ 3455.218476] usb 1-1.5: SerialNumber: 574D43315431373033343231
[ 3455.260880] usb-storage 1-1.5:1.0: USB Mass Storage device detected
[ 3455.261279] scsi host10: usb-storage 1-1.5:1.0
[ 3455.261525] usbcore: registered new interface driver usb-storage
[ 3455.267542] usbcore: registered new interface driver uas
[ 3455.287643] usb 1-1.6: new high-speed USB device number 4 using ehci-pci
[ 3455.367335] usb 1-1.6: New USB device found, idVendor=1058, idProduct=1021, bcdDevice=20.21
[ 3455.367347] usb 1-1.6: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 3455.367353] usb 1-1.6: Product: Ext HDD 1021
[ 3455.367357] usb 1-1.6: Manufacturer: Western Digital
[ 3455.367361] usb 1-1.6: SerialNumber: 574D43315432313138383334
[ 3455.367897] usb-storage 1-1.6:1.0: USB Mass Storage device detected
[ 3455.368175] scsi host11: usb-storage 1-1.6:1.0
[ 3456.301327] scsi 10:0:0:0: Direct-Access WD Ext HDD 1021 2021 PQ: 0 ANSI: 4
[ 3456.302077] sd 10:0:0:0: Attached scsi generic sg4 type 0
[ 3456.302737] sd 10:0:0:0: [sde] 3907024896 512-byte logical blocks: (2.00 TB/1.82 TiB)
[ 3456.304758] sd 10:0:0:0: [sde] Write Protect is off
[ 3456.304771] sd 10:0:0:0: [sde] Mode Sense: 17 00 10 08
[ 3456.306754] sd 10:0:0:0: [sde] No Caching mode page found
[ 3456.306763] sd 10:0:0:0: [sde] Assuming drive cache: write through
[ 3456.335251] sde: sde1
[ 3456.335455] sd 10:0:0:0: [sde] Attached SCSI disk
[ 3456.428905] scsi 11:0:0:0: Direct-Access WD Ext HDD 1021 2021 PQ: 0 ANSI: 4
[ 3456.430291] sd 11:0:0:0: Attached scsi generic sg5 type 0
[ 3456.431366] sd 11:0:0:0: [sdf] 3907024896 512-byte logical blocks: (2.00 TB/1.82 TiB)
[ 3456.433375] sd 11:0:0:0: [sdf] Write Protect is off
[ 3456.433387] sd 11:0:0:0: [sdf] Mode Sense: 17 00 10 08
[ 3456.435502] sd 11:0:0:0: [sdf] No Caching mode page found
[ 3456.435515] sd 11:0:0:0: [sdf] Assuming drive cache: write through
[ 3456.457754] sdf: sdf1
[ 3456.457974] sd 11:0:0:0: [sdf] Attached SCSI disk

The partitions on those drives look as follows:

/dev/sdd1 2048 3907024895 3907022848 1.8T fd Linux raid autodetect
/dev/sde1 2048 3907024895 3907022848 1.8T fd Linux raid autodetect
/dev/sdf1 2048 3907024895 3907022848 1.8T fd Linux raid autodetect

Assembling said drives into a mirrored (RAID 1) array on /dev/md127:

mdadm --assemble /dev/md127 /dev/sdd1 /dev/sde1 /dev/sdf1
mdadm: /dev/md127 has been started with 3 drives.

To check the status:

mdadm --detail /dev/md127
/dev/md127:
           Version : 1.2
     Creation Time : Wed Mar  6 22:16:05 2013
        Raid Level : raid1
        Array Size : 1953380160 (1862.89 GiB 2000.26 GB)
     Used Dev Size : 1953380160 (1862.89 GiB 2000.26 GB)
      Raid Devices : 3
     Total Devices : 3
       Persistence : Superblock is persistent

       Update Time : Sat Mar 19 07:07:49 2022
             State : clean 
    Active Devices : 3
   Working Devices : 3
    Failed Devices : 0
     Spare Devices : 0

Consistency Policy : resync

              Name : micemouse:0
              UUID : ed4531c4:59c132b2:a6bfc3d1:6da3b928
            Events : 6587

    Number   Major   Minor   RaidDevice State
       4       8       65        0      active sync   /dev/sde1
       5       8       81        1      active sync   /dev/sdf1
       3       8       49        2      active sync   /dev/sdd1

It's also possible to let mdadm scan for the array and assemble it by itself:

mdadm --assemble --scan

Adding a drive

Previously I had only two drives, but I added one as follows:

# mdadm --grow /dev/md127 --add /dev/sdd1 --raid-devices=3
mdadm: added /dev/sdd1
raid_disks for /dev/md127 set to 3

Watching the rebuild with watch cat /proc/mdstat:

Every 2.0s: cat /proc/mdstat                  sherlock: Sat Nov 24 15:28:29 2018

Personalities : [raid1]
md127 : active raid1 sdd1[3] sdf1[2] sde1[1]
      1953380160 blocks super 1.2 [3/2] [UU_]
      [>....................]  recovery =  0.0% (1446912/1953380160) finish=957.9min speed=33959K/sec

unused devices: <none>

That took some time.